Posted to users@solr.apache.org by Mike <mz...@gmail.com> on 2022/07/04 09:01:32 UTC

Solr eats up all the memory

Hello!

My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up all
the memory and because of that PHP works very, very slowly. What can I do?

Thanks

Mike

Re: Solr eats up all the memory

Posted by Gus Heck <gu...@gmail.com>.
Search generally trades memory and disk for speed. Thus Solr tends to use
the JVM memory available to it, and it also benefits greatly from excess
memory that the OS can dedicate to caching index data on disk. For this
reason, while it is certainly *possible* to run Solr on the same machine as
your PHP server, it's a suboptimal solution from the perspective of getting
the most out of Solr. In addition, if your PHP server serves a user
interface to end users and is exposed to the internet, hosting Solr on it
means you have to be very careful about security, because *by necessary
design* the admin features can do things like delete your index or store
arbitrary data in it. These features are necessary to provide the powerful
functionality Solr offers, and they are meant to be protected either by
properly configuring Solr's security features or by carefully sequestering
Solr behind a firewall and/or proxy.

If the server hosting Solr can be reached directly from the internet,
that's one less barrier for attackers. Typically Solr is run on a separate,
internal-only server, both for performance reasons and to simplify security
concerns.
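Gus's sequestering advice can be sketched concretely. In recent Solr releases, solr.in.sh has a SOLR_JETTY_HOST setting that controls which interface Solr binds to; the address below and the firewall rules are illustrative assumptions, not something from this thread:

```shell
# solr.in.sh -- bind Solr only to an internal interface so it cannot be
# reached directly from the internet (address is an example):
SOLR_JETTY_HOST="127.0.0.1"

# A host firewall can additionally restrict port 8983 to the PHP app server.
# Example with ufw; the 10.0.0.5 address is hypothetical:
#   ufw allow from 10.0.0.5 to any port 8983 proto tcp
#   ufw deny 8983/tcp
```

With something like this in place, the PHP application talks to Solr over the internal network while the admin UI stays unreachable from outside.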

While one can "go for it" if one has lots of memory, it's worth noting that
very large heaps can also lead to very long GC pauses. So unless you need
to do heavyweight analytics/stats/sorting etc., several small machines are
better than one large one (assuming you care about controlling your maximum
latency).

-Gus

On Mon, Jul 4, 2022 at 3:31 PM Dave <ha...@gmail.com> wrote:

> Also, for $115 I can buy a terabyte Samsung SSD, which helps a lot. It
> comes to a point where money spent on hardware will outweigh money spent
> on engineering man-hours, and you still come to the same conclusion: as
> much RAM as your rack can take, and as big and fast a RAID of SSDs as it
> can take. Remember, since a Solr index is always meant to be destroyed and
> recreated, you don't have to worry much about hardware failure if you just
> buy two of everything and have a backup server ready and waiting to take
> over while the original fails and is reconstructed.
>
> > On Jul 4, 2022, at 1:32 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> >
> > On 7/4/22 03:01, Mike wrote:
> >> My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up all
> >> the memory and because of that PHP works very, very slowly. What can I do?
> >
> > Solr is a Java program.  A Java program will never directly use more
> > memory than you specify for the max heap size.  We cannot make any general
> > recommendations about what heap size you need, because there is a good
> > chance that any recommendation we make would be completely wrong for your
> > install.  I did see that someone recommended not going above 31G ... and
> > this is good advice.  At 32 GB, Java switches to 64-bit pointers instead of
> > 32-bit.  So a heap size of 32 GB actually has LESS memory available than a
> > heap size of 31 GB.
> >
> > The OS will use additional memory beyond the heap for caching the index
> > data, but that is completely outside of Solr's control. Note that 64GB
> > total memory for a 500GB index is almost certainly not enough memory,
> > ESPECIALLY if the same server is used for things other than Solr.  I wrote
> > the following wiki page:
> >
> > https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
> >
> > Others have recommended that you run Solr on dedicated hardware that is
> > not used for any other purpose.  I concur with that recommendation.
> >
> > Thanks,
> > Shawn
> >
>
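Shawn's 31 GB rule of thumb can be checked on a specific JVM: HotSpot drops compressed ordinary object pointers ("compressed oops") once the maximum heap reaches roughly 32 GB. A quick sketch, assuming a local `java` on the PATH:

```shell
# Print whether compressed oops are in effect at two heap sizes.
# On typical 64-bit HotSpot JVMs the flag reports true at 31g and
# false at 32g (the heap is not actually allocated by -Xmx alone).
if command -v java >/dev/null 2>&1; then
  java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep 'UseCompressedOops '
  java -Xmx32g -XX:+PrintFlagsFinal -version 2>/dev/null | grep 'UseCompressedOops '
fi
```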


-- 
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)

Re: Solr eats up all the memory

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Deepak,

On 7/5/22 07:29, Deepak Goel wrote:
> I would also suggest you look at which GC mechanism you use. Increasing RAM
> and heap size might leave the application frozen for a long time
> (during GC).

Stop-the-world long-pause GCs have been dead for a long, long time.

-chris
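Whether GC pauses actually bite is measurable rather than a matter of opinion. solr.in.sh has a GC_LOG_OPTS variable for exactly this; a sketch using Java 9+ unified logging (the path and rotation sizes are example values):

```shell
# solr.in.sh -- record every GC event to a rolling log so long
# stop-the-world pauses show up as "Pause" entries:
GC_LOG_OPTS="-Xlog:gc*:file=/var/solr/logs/gc.log:time,uptime:filecount=9,filesize=20M"
```

Grepping the resulting log for "Pause" lines shows the actual worst-case latency the collector imposes.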

> On Tue, Jul 5, 2022 at 4:43 PM Dave <ha...@gmail.com> wrote:
> 
>> Exactly. You could have the best engineer on your continent, but the end
>> result is the same: more metal. I could build a very fast search server
>> for less than a week of my salary, so what's the point of wasting two
>> weeks trying to solve a problem when the solution is literally right
>> there: a big SSD and a lot of memory? Then I could work on things that
>> are actually important rather than trying to squeeze blood from a turnip.
>>
>>> On Jul 5, 2022, at 6:11 AM, Charlie Hull <
>> chull@opensourceconnections.com> wrote:
>>>
>>> I think you're missing my point.
>>>
>>> Good engineers, even mediocre ones, are expensive, and the great ones
>>> are rare. It's a tedious task chasing tiny performance gains when you
>>> know you're limited by the hardware, and a bored engineer might just go
>>> and look for another job. So if you fail to realise that a capital
>>> expense for hardware or hosting is necessary, you run the risk of losing
>>> the people that make your search engine work (even if they stay, they
>>> could also be doing something more useful to your business).
>>>
>>> Charlie
>>>
>>>> On 05/07/2022 10:49, Deepak Goel wrote:
>>>> If you are tearing your hair out on 'Number of Hours' required for
>> tuning
>>>> your software, it's time you switch to a better quality performance
>>>> engineer.
>>>>
>>>> Deepak
>>>> "The greatness of a nation can be judged by the way its animals are
>> treated
>>>> - Mahatma Gandhi"
>>>>
>>>> +91 73500 12833
>>>> deicool@gmail.com
>>>>
>>>> Facebook: https://www.facebook.com/deicool
>>>> LinkedIn: www.linkedin.com/in/deicool
>>>>
>>>> "Plant a Tree, Go Green"
>>>>
>>>> Make In India: http://www.makeinindia.com/home
>>>>
>>>>
>>>> On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull<
>> chull@opensourceconnections.com>
>>>> wrote:
>>>>
>>>>> Equally, it's not good management practice to burn engineering hours
>>>>> trying to optimise performance to avoid spending (often much less)
>>>>> money on sufficient hardware to do the job. I've seen this happen many
>>>>> times, sadly.
>>>>>
>>>>> Charlie
>>>>>
>>>>> On 05/07/2022 10:33, Deepak Goel wrote:
>>>>>> Not a good software engineering practice to beef up the hardware
>>>>>> blindly. Of course, when you have tuned the software to a point where
>>>>>> you can't tune anymore, you can then turn your eyes to hardware.
>>>>> --
>>>>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
>>>>> Founding member of The Search Network <http://www.thesearchnetwork.com>
>>>>> and co-author of Searching the Enterprise
>>>>> <https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf>
>>>>> tel/fax: +44 (0)8700 118334
>>>>> mobile: +44 (0)7767 825828
>>>>>
>>>>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
>>>>> Amtsgericht Charlottenburg | HRB 230712 B
>>>>> Geschäftsführer: John M. Woodell | David E. Pugh
>>>>> Finanzamt: Berlin Finanzamt für Körperschaften II
>>>>>
>>
> 

Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
On Thu, Jul 7, 2022 at 3:31 AM Christopher Schultz <
chris@christopherschultz.net> wrote:

> Deepak,
>
> On 7/5/22 09:30, Deepak Goel wrote:
> > Do you mind telling us which Java version you are using?
>
> It doesn't matter. Xms == Xmx is always a good idea for server
> processes. For a desktop application like KeyStore Explorer, let the JVM
> manage heap size fluctuations; for servers, tell the JVM what you want.
>

Chris, I am asking for the Java version because there are many more
parameters for tuning the GC than Xms == Xmx.


> -chris
>
> > On Tue, 5 Jul 2022, 17:55 Dave, <ha...@gmail.com> wrote:
> >
>> Also, a good rule of thumb I found: set your Xmx and Xms, the maximum
>> and minimum heap sizes, to exactly the same value; you don't want Java
>> to try to figure it out.
> >>
> >>> On Jul 5, 2022, at 7:52 AM, Ritvik Sharma <ri...@gmail.com> wrote:
> >>>
> >>> Just add Java memory parameters in the Solr config, which should not
> >>> be more than 75% of total RAM, and use G1GC.
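The advice in this subthread (Xms == Xmx, staying well under 32 GB, G1) maps directly onto solr.in.sh, where a single SOLR_HEAP setting fixes both bounds. A sketch with example values for a 64 GB machine, not a recommendation:

```shell
# solr.in.sh -- heap sizing sketch:
# - SOLR_HEAP sets both -Xms and -Xmx, so the JVM never resizes the heap
# - kept well under 32 GB so compressed oops stay enabled
# - the rest of RAM is left to the OS page cache for the index files
SOLR_HEAP="8g"
GC_TUNE="-XX:+UseG1GC"   # G1 is already the default in recent Solr releases
```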

Re: Solr eats up all the memory

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Deepak,

On 7/5/22 09:30, Deepak Goel wrote:
> Do you mind telling us which Java version you are using?

It doesn't matter. Xms == Xmx is always a good idea for server
processes. For a desktop application like KeyStore Explorer, let the JVM
manage heap size fluctuations; for servers, tell the JVM what you want.

-chris


Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
Do you mind telling us which Java version you are using?

> >>> have
> >>>>>> to
> >>>>>>>> worry much about hardware failure if you just buy two of
> everything
> >>> and
> >>>>>>>> have a backup server ready and waiting to take over while the
> >>> original
> >>>>>>>> fails and is reconstructed.
> >>>>>>>>
> >>>>>>>>> On Jul 4, 2022, at 1:32 PM, Shawn Heisey<ap...@elyograg.org>
> >>> wrote:
> >>>>>>>>>
> >>>>>>>>> On 7/4/22 03:01, Mike wrote:
> >>>>>>>>>> My Solr index size is around 500GB and I have 64GB of RAM. Solr
> >>> eats
> >>>>>> up
> >>>>>>>> all
> >>>>>>>>>> the memory and because of that PHP works very, very slowly. What
> >>> can I
> >>>>>>>> do?
> >>>>>>>>> Solr is a Java program.  A Java program will never directly use
> >> more
> >>>>>>>> memory than you specify for the max heap size.  We cannot make any
> >>>>>> general
> >>>>>>>> recommendations about what heap size you need, because there is a
> >>> good
> >>>>>>>> chance that any recommendation we make would be completely wrong
> >> for
> >>>>>> your
> >>>>>>>> install.  I did see that someone recommended not going above 31G
> >> ...
> >>> and
> >>>>>>>> this is good advice.  At 32 GB, Java switches to 64-bit pointers
> >>>>>> instead of
> >>>>>>>> 32-bit.  So a heap size of 32 GB actually has LESS memory
> available
> >>>>>> than a
> >>>>>>>> heap size of 31 GB.
> >>>>>>>>> The OS will use additional memory beyond the heap for caching the
> >>> index
> >>>>>>>> data, but that is completely outside of Solr's control. Note that
> >>> 64GB
> >>>>>>>> total memory for a 500GB index is almost certainly not enough
> >> memory,
> >>>>>>>> ESPECIALLY if the same server is used for things other than Solr.
> >> I
> >>>>>> wrote
> >>>>>>>> the following wiki page:
> >>>>>>
> >>>
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
> >>>>>>>>> Others have recommended that you run Solr on dedicated hardware
> >>> that is
> >>>>>>>> not used for any other purpose.  I concur with that
> recommendation.
> >>>>>>>>> Thanks,
> >>>>>>>>> Shawn
> >>>>>>>>>
> >>>>>> --
> >>>>>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> >>>>>> Founding member of The Search Network<
> >> http://www.thesearchnetwork.com>
> >>>>>> and co-author of Searching the Enterprise
> >>>>>> <
> >>>>>>
> >>>
> >>
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> >>>>>> tel/fax: +44 (0)8700 118334
> >>>>>> mobile: +44 (0)7767 825828
> >>>>>>
> >>>>>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437
> Berlin
> >>>>>> Amtsgericht Charlottenburg | HRB 230712 B
> >>>>>> Geschäftsführer: John M. Woodell | David E. Pugh
> >>>>>> Finanzamt: Berlin Finanzamt für Körperschaften II
> >>>>>>
> >>>>>> --
> >>>>>> This email has been checked for viruses by AVG.
> >>>>>> https://www.avg.com
> >>>>>>
> >>>> --
> >>>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> >>>> Founding member of The Search Network <
> http://www.thesearchnetwork.com
> >>>
> >>> and co-author of Searching the Enterprise <
> >>>
> >>
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> >>>>
> >>>> tel/fax: +44 (0)8700 118334
> >>>> mobile: +44 (0)7767 825828
> >>>>
> >>>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> >>>> Amtsgericht Charlottenburg | HRB 230712 B
> >>>> Geschäftsführer: John M. Woodell | David E. Pugh
> >>>> Finanzamt: Berlin Finanzamt für Körperschaften II
> >>>>
> >>>
> >>
>

Re: Solr eats up all the memory

Posted by Dave <ha...@gmail.com>.
Also, a good rule of thumb I found: set your Xmx and Xms (maximum and minimum heap sizes) to exactly the same value. You don’t want Java to try to figure it out.
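
For anyone wondering where that goes in Solr: the startup scripts read heap settings from solr.in.sh. A minimal sketch, with the 8g value purely as an example and not a sizing recommendation:

```shell
# solr.in.sh -- pin minimum and maximum heap to the same size so the
# JVM never resizes the heap at runtime.
SOLR_HEAP="8g"                 # sets both -Xms and -Xmx to 8g

# Equivalent explicit form (takes effect instead of SOLR_HEAP if set):
# SOLR_JAVA_MEM="-Xms8g -Xmx8g"
```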

> On Jul 5, 2022, at 7:52 AM, Ritvik Sharma <ri...@gmail.com> wrote:
> 
> Just Add java memory parameters in solr config which should not be more
> than 75% of total RAM. and use G1GC.
> 
> 
>> On Tue, 5 Jul 2022 at 4:59 PM, Deepak Goel <de...@gmail.com> wrote:
>> 
>> I would also suggest you look at which GC mechanism you use. Increasing RAM
>> and Heap-Size might result in the application freezed for a long time
>> (during GC).
>> 
>> Deepak
>> "The greatness of a nation can be judged by the way its animals are treated
>> - Mahatma Gandhi"
>> 
>> +91 73500 12833
>> deicool@gmail.com
>> 
>> Facebook: https://www.facebook.com/deicool
>> LinkedIn: www.linkedin.com/in/deicool
>> 
>> "Plant a Tree, Go Green"
>> 
>> Make In India : http://www.makeinindia.com/home
>> 
>> 
>>> On Tue, Jul 5, 2022 at 4:43 PM Dave <ha...@gmail.com> wrote:
>>> 
>>> Exactly. You could have the best engineer on your continent but the end
>>> result is the same, more metal.  I could build a very fast search server
>>> for less than a week of my salary so what’s the point of wasting two
>> weeks
>>> trying to solve a problem when the solution is literally just right
>> there,
>>> a big ssd and a lot of memory, then I could work on things that are
>>> actually important rather than try to squeeze blood from a turnip.
>>> 
>>>> On Jul 5, 2022, at 6:11 AM, Charlie Hull <
>>> chull@opensourceconnections.com> wrote:
>>>> 
>>>> I think you're missing my point.
>>>> 
>>>> Good engineers, even mediocre ones, are expensive, and the great ones
>>> are rare. It's a tedious task chasing tiny performance gains when you
>> know
>>> you're limited by the hardware and a bored engineer might just go and
>> look
>>> for another job. So if you fail to realise that a capital expense for
>>> hardware or hosting is necessary, you run the risk of losing the people
>>> that make your search engine work (even if they stay they could also be
>>> doing something more useful to your business).
>>>> 
>>>> Charlie
>>>> 
>>>>> On 05/07/2022 10:49, Deepak Goel wrote:
>>>>> If you are tearing your hair out on 'Number of Hours' required for
>>> tuning
>>>>> your software, it's time you switch to a better quality performance
>>>>> engineer.
>>>>> 
>>>>> Deepak
>>>>> "The greatness of a nation can be judged by the way its animals are
>>> treated
>>>>> - Mahatma Gandhi"
>>>>> 
>>>>> +91 73500 12833
>>>>> deicool@gmail.com
>>>>> 
>>>>> Facebook:https://www.facebook.com/deicool
>>>>> LinkedIn:www.linkedin.com/in/deicool
>>>>> 
>>>>> "Plant a Tree, Go Green"
>>>>> 
>>>>> Make In India :http://www.makeinindia.com/home
>>>>> 
>>>>> 
>>>>> On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull<
>>> chull@opensourceconnections.com>
>>>>> wrote:
>>>>> 
>>>>>> Equally it's not a good management practice to burn engineering hours
>>>>>> trying to optimise performance to avoid spending (often much less)
>>> money
>>>>>> on sufficient hardware to do the job. I've seen this happen many
>> times,
>>>>>> sadly.
>>>>>> 
>>>>>> Charlie
>>>>>> 
>>>>>> On 05/07/2022 10:33, Deepak Goel wrote:
>>>>>>> Not a good software engineering practice to beef up the hardware
>>> blindly.
>>>>>>> Of Course when you have tuned the software to a point where you
>> can't
>>>>>> tune
>>>>>>> anymore, you can then turn your eyes to hardware.
>>>>>>> 
>>>>>>> Deepak
>>>>>>> "The greatness of a nation can be judged by the way its animals are
>>>>>> treated
>>>>>>> - Mahatma Gandhi"
>>>>>>> 
>>>>>>> +91 73500 12833
>>>>>>> deicool@gmail.com
>>>>>>> 
>>>>>>> Facebook:https://www.facebook.com/deicool
>>>>>>> LinkedIn:www.linkedin.com/in/deicool
>>>>>>> 
>>>>>>> "Plant a Tree, Go Green"
>>>>>>> 
>>>>>>> Make In India :http://www.makeinindia.com/home
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Jul 5, 2022 at 1:01 AM Dave<ha...@gmail.com>
>>>>>> wrote:
>>>>>>>> Also for $115 I can buy a terabyte of a Samsung ssd, which helps a
>>> lot.
>>>>>> It
>>>>>>>> comes to a point where money on hardware will outweigh money on
>>>>>> engineering
>>>>>>>> man power hours, and still come to the same conclusion. As much ram
>>> as
>>>>>> your
>>>>>>>> rack can take and as big and fast of a raid ssd drive it can take.
>>>>>> Remember
>>>>>>>> since solr is always meant to be destroyed and recreated you don’t
>>> have
>>>>>> to
>>>>>>>> worry much about hardware failure if you just buy two of everything
>>> and
>>>>>>>> have a backup server ready and waiting to take over while the
>>> original
>>>>>>>> fails and is reconstructed.
>>>>>>>> 
>>>>>>>>> On Jul 4, 2022, at 1:32 PM, Shawn Heisey<ap...@elyograg.org>
>>> wrote:
>>>>>>>>> 
>>>>>>>>> On 7/4/22 03:01, Mike wrote:
>>>>>>>>>> My Solr index size is around 500GB and I have 64GB of RAM. Solr
>>> eats
>>>>>> up
>>>>>>>> all
>>>>>>>>>> the memory and because of that PHP works very, very slowly. What
>>> can I
>>>>>>>> do?
>>>>>>>>> Solr is a Java program.  A Java program will never directly use
>> more
>>>>>>>> memory than you specify for the max heap size.  We cannot make any
>>>>>> general
>>>>>>>> recommendations about what heap size you need, because there is a
>>> good
>>>>>>>> chance that any recommendation we make would be completely wrong
>> for
>>>>>> your
>>>>>>>> install.  I did see that someone recommended not going above 31G
>> ...
>>> and
>>>>>>>> this is good advice.  At 32 GB, Java switches to 64-bit pointers
>>>>>> instead of
>>>>>>>> 32-bit.  So a heap size of 32 GB actually has LESS memory available
>>>>>> than a
>>>>>>>> heap size of 31 GB.
>>>>>>>>> The OS will use additional memory beyond the heap for caching the
>>> index
>>>>>>>> data, but that is completely outside of Solr's control. Note that
>>> 64GB
>>>>>>>> total memory for a 500GB index is almost certainly not enough
>> memory,
>>>>>>>> ESPECIALLY if the same server is used for things other than Solr.
>> I
>>>>>> wrote
>>>>>>>> the following wiki page:
>>>>>> 
>>> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
>>>>>>>>> Others have recommended that you run Solr on dedicated hardware
>>> that is
>>>>>>>> not used for any other purpose.  I concur with that recommendation.
>>>>>>>>> Thanks,
>>>>>>>>> Shawn
>>>>>>>>> 
>>>>>> --
>>>>>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
>>>>>> Founding member of The Search Network<
>> http://www.thesearchnetwork.com>
>>>>>> and co-author of Searching the Enterprise
>>>>>> <
>>>>>> 
>>> 
>> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
>>>>>> tel/fax: +44 (0)8700 118334
>>>>>> mobile: +44 (0)7767 825828
>>>>>> 
>>>>>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
>>>>>> Amtsgericht Charlottenburg | HRB 230712 B
>>>>>> Geschäftsführer: John M. Woodell | David E. Pugh
>>>>>> Finanzamt: Berlin Finanzamt für Körperschaften II
>>>>>> 
>>>>>> 
>>>> --
>>>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
>>>> Founding member of The Search Network <http://www.thesearchnetwork.com
>>> 
>>> and co-author of Searching the Enterprise <
>>> 
>> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
>>>> 
>>>> tel/fax: +44 (0)8700 118334
>>>> mobile: +44 (0)7767 825828
>>>> 
>>>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
>>>> Amtsgericht Charlottenburg | HRB 230712 B
>>>> Geschäftsführer: John M. Woodell | David E. Pugh
>>>> Finanzamt: Berlin Finanzamt für Körperschaften II
>>>> 
>>> 
>> 

Re: Solr eats up all the memory

Posted by Ritvik Sharma <ri...@gmail.com>.
Just add the Java memory parameters in the Solr config, which should not be more
than 75% of total RAM, and use G1GC.
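
In solr.in.sh the collector is normally selected via GC_TUNE. A hedged sketch (flag values below are illustrative, not tuned advice, and recent JDKs already default to G1):

```shell
# solr.in.sh -- select G1GC with a target pause goal.
# MaxGCPauseMillis is a goal, not a guarantee; tune for your workload.
GC_TUNE="-XX:+UseG1GC \
         -XX:MaxGCPauseMillis=250 \
         -XX:+ParallelRefProcEnabled"
```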


On Tue, 5 Jul 2022 at 4:59 PM, Deepak Goel <de...@gmail.com> wrote:

> I would also suggest you look at which GC mechanism you use. Increasing RAM
> and Heap-Size might result in the application freezed for a long time
> (during GC).
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Tue, Jul 5, 2022 at 4:43 PM Dave <ha...@gmail.com> wrote:
>
> > Exactly. You could have the best engineer on your continent but the end
> > result is the same, more metal.  I could build a very fast search server
> > for less than a week of my salary so what’s the point of wasting two
> weeks
> > trying to solve a problem when the solution is literally just right
> there,
> > a big ssd and a lot of memory, then I could work on things that are
> > actually important rather than try to squeeze blood from a turnip.
> >
> > > On Jul 5, 2022, at 6:11 AM, Charlie Hull <
> > chull@opensourceconnections.com> wrote:
> > >
> > > I think you're missing my point.
> > >
> > > Good engineers, even mediocre ones, are expensive, and the great ones
> > are rare. It's a tedious task chasing tiny performance gains when you
> know
> > you're limited by the hardware and a bored engineer might just go and
> look
> > for another job. So if you fail to realise that a capital expense for
> > hardware or hosting is necessary, you run the risk of losing the people
> > that make your search engine work (even if they stay they could also be
> > doing something more useful to your business).
> > >
> > > Charlie
> > >
> > >> On 05/07/2022 10:49, Deepak Goel wrote:
> > >> If you are tearing your hair out on 'Number of Hours' required for
> > tuning
> > >> your software, it's time you switch to a better quality performance
> > >> engineer.
> > >>
> > >> Deepak
> > >> "The greatness of a nation can be judged by the way its animals are
> > treated
> > >> - Mahatma Gandhi"
> > >>
> > >> +91 73500 12833
> > >> deicool@gmail.com
> > >>
> > >> Facebook:https://www.facebook.com/deicool
> > >> LinkedIn:www.linkedin.com/in/deicool
> > >>
> > >> "Plant a Tree, Go Green"
> > >>
> > >> Make In India :http://www.makeinindia.com/home
> > >>
> > >>
> > >> On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull<
> > chull@opensourceconnections.com>
> > >> wrote:
> > >>
> > >>> Equally it's not a good management practice to burn engineering hours
> > >>> trying to optimise performance to avoid spending (often much less)
> > money
> > >>> on sufficient hardware to do the job. I've seen this happen many
> times,
> > >>> sadly.
> > >>>
> > >>> Charlie
> > >>>
> > >>> On 05/07/2022 10:33, Deepak Goel wrote:
> > >>>> Not a good software engineering practice to beef up the hardware
> > blindly.
> > >>>> Of Course when you have tuned the software to a point where you
> can't
> > >>> tune
> > >>>> anymore, you can then turn your eyes to hardware.
> > >>>>
> > >>>> Deepak
> > >>>> "The greatness of a nation can be judged by the way its animals are
> > >>> treated
> > >>>> - Mahatma Gandhi"
> > >>>>
> > >>>> +91 73500 12833
> > >>>> deicool@gmail.com
> > >>>>
> > >>>> Facebook:https://www.facebook.com/deicool
> > >>>> LinkedIn:www.linkedin.com/in/deicool
> > >>>>
> > >>>> "Plant a Tree, Go Green"
> > >>>>
> > >>>> Make In India :http://www.makeinindia.com/home
> > >>>>
> > >>>>
> > >>>> On Tue, Jul 5, 2022 at 1:01 AM Dave<ha...@gmail.com>
> > >>> wrote:
> > >>>>> Also for $115 I can buy a terabyte of a Samsung ssd, which helps a
> > lot.
> > >>> It
> > >>>>> comes to a point where money on hardware will outweigh money on
> > >>> engineering
> > >>>>> man power hours, and still come to the same conclusion. As much ram
> > as
> > >>> your
> > >>>>> rack can take and as big and fast of a raid ssd drive it can take.
> > >>> Remember
> > >>>>> since solr is always meant to be destroyed and recreated you don’t
> > have
> > >>> to
> > >>>>> worry much about hardware failure if you just buy two of everything
> > and
> > >>>>> have a backup server ready and waiting to take over while the
> > original
> > >>>>> fails and is reconstructed.
> > >>>>>
> > >>>>>> On Jul 4, 2022, at 1:32 PM, Shawn Heisey<ap...@elyograg.org>
> >  wrote:
> > >>>>>>
> > >>>>>> On 7/4/22 03:01, Mike wrote:
> > >>>>>>> My Solr index size is around 500GB and I have 64GB of RAM. Solr
> > eats
> > >>> up
> > >>>>> all
> > >>>>>>> the memory and because of that PHP works very, very slowly. What
> > can I
> > >>>>> do?
> > >>>>>> Solr is a Java program.  A Java program will never directly use
> more
> > >>>>> memory than you specify for the max heap size.  We cannot make any
> > >>> general
> > >>>>> recommendations about what heap size you need, because there is a
> > good
> > >>>>> chance that any recommendation we make would be completely wrong
> for
> > >>> your
> > >>>>> install.  I did see that someone recommended not going above 31G
> ...
> > and
> > >>>>> this is good advice.  At 32 GB, Java switches to 64-bit pointers
> > >>> instead of
> > >>>>> 32-bit.  So a heap size of 32 GB actually has LESS memory available
> > >>> than a
> > >>>>> heap size of 31 GB.
> > >>>>>> The OS will use additional memory beyond the heap for caching the
> > index
> > >>>>> data, but that is completely outside of Solr's control. Note that
> > 64GB
> > >>>>> total memory for a 500GB index is almost certainly not enough
> memory,
> > >>>>> ESPECIALLY if the same server is used for things other than Solr.
> I
> > >>> wrote
> > >>>>> the following wiki page:
> > >>>
> > https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
> > >>>>>> Others have recommended that you run Solr on dedicated hardware
> > that is
> > >>>>> not used for any other purpose.  I concur with that recommendation.
> > >>>>>> Thanks,
> > >>>>>> Shawn
> > >>>>>>
> > >>> --
> > >>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> > >>> Founding member of The Search Network<
> http://www.thesearchnetwork.com>
> > >>> and co-author of Searching the Enterprise
> > >>> <
> > >>>
> >
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> > >>> tel/fax: +44 (0)8700 118334
> > >>> mobile: +44 (0)7767 825828
> > >>>
> > >>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> > >>> Amtsgericht Charlottenburg | HRB 230712 B
> > >>> Geschäftsführer: John M. Woodell | David E. Pugh
> > >>> Finanzamt: Berlin Finanzamt für Körperschaften II
> > >>>
> > >>>
> > > --
> > > Charlie Hull - Managing Consultant at OpenSource Connections Limited
> > > Founding member of The Search Network <http://www.thesearchnetwork.com
> >
> > and co-author of Searching the Enterprise <
> >
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> > >
> > > tel/fax: +44 (0)8700 118334
> > > mobile: +44 (0)7767 825828
> > >
> > > OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> > > Amtsgericht Charlottenburg | HRB 230712 B
> > > Geschäftsführer: John M. Woodell | David E. Pugh
> > > Finanzamt: Berlin Finanzamt für Körperschaften II
> > >
> >
>

Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
I would also suggest you look at which GC mechanism you use. Increasing RAM
and heap size might result in the application being frozen for a long time
(during GC).
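
One way to see whether long GC pauses are actually happening is to enable GC logging and inspect the pause times. A sketch for JDK 9+ unified logging, via the GC_LOG_OPTS hook in solr.in.sh (the log path and rotation values are examples):

```shell
# solr.in.sh -- write rotated GC logs so pause durations can be inspected.
GC_LOG_OPTS="-Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M"
```

Grepping the resulting log for "Pause" entries shows how long each stop-the-world phase lasted.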

Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Tue, Jul 5, 2022 at 4:43 PM Dave <ha...@gmail.com> wrote:

> Exactly. You could have the best engineer on your continent but the end
> result is the same, more metal.  I could build a very fast search server
> for less than a week of my salary so what’s the point of wasting two weeks
> trying to solve a problem when the solution is literally just right there,
> a big ssd and a lot of memory, then I could work on things that are
> actually important rather than try to squeeze blood from a turnip.
>
> > On Jul 5, 2022, at 6:11 AM, Charlie Hull <
> chull@opensourceconnections.com> wrote:
> >
> > I think you're missing my point.
> >
> > Good engineers, even mediocre ones, are expensive, and the great ones
> are rare. It's a tedious task chasing tiny performance gains when you know
> you're limited by the hardware and a bored engineer might just go and look
> for another job. So if you fail to realise that a capital expense for
> hardware or hosting is necessary, you run the risk of losing the people
> that make your search engine work (even if they stay they could also be
> doing something more useful to your business).
> >
> > Charlie
> >
> >> On 05/07/2022 10:49, Deepak Goel wrote:
> >> If you are tearing your hair out on 'Number of Hours' required for
> tuning
> >> your software, it's time you switch to a better quality performance
> >> engineer.
> >>
> >> Deepak
> >> "The greatness of a nation can be judged by the way its animals are
> treated
> >> - Mahatma Gandhi"
> >>
> >> +91 73500 12833
> >> deicool@gmail.com
> >>
> >> Facebook:https://www.facebook.com/deicool
> >> LinkedIn:www.linkedin.com/in/deicool
> >>
> >> "Plant a Tree, Go Green"
> >>
> >> Make In India :http://www.makeinindia.com/home
> >>
> >>
> >> On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull<
> chull@opensourceconnections.com>
> >> wrote:
> >>
> >>> Equally it's not a good management practice to burn engineering hours
> >>> trying to optimise performance to avoid spending (often much less)
> money
> >>> on sufficient hardware to do the job. I've seen this happen many times,
> >>> sadly.
> >>>
> >>> Charlie
> >>>
> >>> On 05/07/2022 10:33, Deepak Goel wrote:
> >>>> Not a good software engineering practice to beef up the hardware
> blindly.
> >>>> Of Course when you have tuned the software to a point where you can't
> >>> tune
> >>>> anymore, you can then turn your eyes to hardware.
> >>>>
> >>>> Deepak
> >>>> "The greatness of a nation can be judged by the way its animals are
> >>> treated
> >>>> - Mahatma Gandhi"
> >>>>
> >>>> +91 73500 12833
> >>>> deicool@gmail.com
> >>>>
> >>>> Facebook:https://www.facebook.com/deicool
> >>>> LinkedIn:www.linkedin.com/in/deicool
> >>>>
> >>>> "Plant a Tree, Go Green"
> >>>>
> >>>> Make In India :http://www.makeinindia.com/home
> >>>>
> >>>>
> >>>> On Tue, Jul 5, 2022 at 1:01 AM Dave<ha...@gmail.com>
> >>> wrote:
> >>>>> Also for $115 I can buy a terabyte of a Samsung ssd, which helps a
> lot.
> >>> It
> >>>>> comes to a point where money on hardware will outweigh money on
> >>> engineering
> >>>>> man power hours, and still come to the same conclusion. As much ram
> as
> >>> your
> >>>>> rack can take and as big and fast of a raid ssd drive it can take.
> >>> Remember
> >>>>> since solr is always meant to be destroyed and recreated you don’t
> have
> >>> to
> >>>>> worry much about hardware failure if you just buy two of everything
> and
> >>>>> have a backup server ready and waiting to take over while the
> original
> >>>>> fails and is reconstructed.
> >>>>>
> >>>>>> On Jul 4, 2022, at 1:32 PM, Shawn Heisey<ap...@elyograg.org>
>  wrote:
> >>>>>>
> >>>>>> On 7/4/22 03:01, Mike wrote:
> >>>>>>> My Solr index size is around 500GB and I have 64GB of RAM. Solr
> eats
> >>> up
> >>>>> all
> >>>>>>> the memory and because of that PHP works very, very slowly. What
> can I
> >>>>> do?
> >>>>>> Solr is a Java program.  A Java program will never directly use more
> >>>>> memory than you specify for the max heap size.  We cannot make any
> >>> general
> >>>>> recommendations about what heap size you need, because there is a
> good
> >>>>> chance that any recommendation we make would be completely wrong for
> >>> your
> >>>>> install.  I did see that someone recommended not going above 31G ...
> and
> >>>>> this is good advice.  At 32 GB, Java switches to 64-bit pointers
> >>> instead of
> >>>>> 32-bit.  So a heap size of 32 GB actually has LESS memory available
> >>> than a
> >>>>> heap size of 31 GB.
> >>>>>> The OS will use additional memory beyond the heap for caching the
> index
> >>>>> data, but that is completely outside of Solr's control. Note that
> 64GB
> >>>>> total memory for a 500GB index is almost certainly not enough memory,
> >>>>> ESPECIALLY if the same server is used for things other than Solr.  I
> >>> wrote
> >>>>> the following wiki page:
> >>>
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
> >>>>>> Others have recommended that you run Solr on dedicated hardware
> that is
> >>>>> not used for any other purpose.  I concur with that recommendation.
> >>>>>> Thanks,
> >>>>>> Shawn
> >>>>>>
> >>> --
> >>> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> >>> Founding member of The Search Network<http://www.thesearchnetwork.com>
> >>> and co-author of Searching the Enterprise
> >>> <
> >>>
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> >>> tel/fax: +44 (0)8700 118334
> >>> mobile: +44 (0)7767 825828
> >>>
> >>> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> >>> Amtsgericht Charlottenburg | HRB 230712 B
> >>> Geschäftsführer: John M. Woodell | David E. Pugh
> >>> Finanzamt: Berlin Finanzamt für Körperschaften II
> >>>
> >>>
> > --
> > Charlie Hull - Managing Consultant at OpenSource Connections Limited
> > Founding member of The Search Network <http://www.thesearchnetwork.com>
> and co-author of Searching the Enterprise <
> https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf
> >
> > tel/fax: +44 (0)8700 118334
> > mobile: +44 (0)7767 825828
> >
> > OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> > Amtsgericht Charlottenburg | HRB 230712 B
> > Geschäftsführer: John M. Woodell | David E. Pugh
> > Finanzamt: Berlin Finanzamt für Körperschaften II
> >
>

Re: Solr eats up all the memory

Posted by Dave <ha...@gmail.com>.
Exactly. You could have the best engineer on your continent, but the end result is the same: more metal. I could build a very fast search server for less than a week of my salary, so what’s the point of wasting two weeks trying to solve a problem when the solution is literally right there, a big SSD and a lot of memory? Then I could work on things that are actually important rather than trying to squeeze blood from a turnip.

> On Jul 5, 2022, at 6:11 AM, Charlie Hull <ch...@opensourceconnections.com> wrote:
> 
> I think you're missing my point.
> 
> Good engineers, even mediocre ones, are expensive, and the great ones are rare. It's a tedious task chasing tiny performance gains when you know you're limited by the hardware and a bored engineer might just go and look for another job. So if you fail to realise that a capital expense for hardware or hosting is necessary, you run the risk of losing the people that make your search engine work (even if they stay they could also be doing something more useful to your business).
> 
> Charlie
> 
>> On 05/07/2022 10:49, Deepak Goel wrote:
>> If you are tearing your hair out on 'Number of Hours' required for tuning
>> your software, it's time you switch to a better quality performance
>> engineer.
>> 
>> Deepak
>> "The greatness of a nation can be judged by the way its animals are treated
>> - Mahatma Gandhi"
>> 
>> +91 73500 12833
>> deicool@gmail.com
>> 
>> Facebook:https://www.facebook.com/deicool
>> LinkedIn:www.linkedin.com/in/deicool
>> 
>> "Plant a Tree, Go Green"
>> 
>> Make In India :http://www.makeinindia.com/home
>> 
>> 
>> On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull<ch...@opensourceconnections.com>
>> wrote:
>> 

Re: Solr eats up all the memory

Posted by Charlie Hull <ch...@opensourceconnections.com>.
I think you're missing my point.

Good engineers, even mediocre ones, are expensive, and the great ones 
are rare. It's a tedious task chasing tiny performance gains when you 
know you're limited by the hardware and a bored engineer might just go 
and look for another job. So if you fail to realise that a capital 
expense for hardware or hosting is necessary, you run the risk of losing 
the people that make your search engine work (even if they stay they 
could also be doing something more useful to your business).

Charlie

On 05/07/2022 10:49, Deepak Goel wrote:
> If you are tearing your hair out on 'Number of Hours' required for tuning
> your software, it's time you switch to a better quality performance
> engineer.
-- 
Charlie Hull - Managing Consultant at OpenSource Connections Limited
Founding member of The Search Network <http://www.thesearchnetwork.com> 
and co-author of Searching the Enterprise 
<https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf>
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828

OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
Amtsgericht Charlottenburg | HRB 230712 B
Geschäftsführer: John M. Woodell | David E. Pugh
Finanzamt: Berlin Finanzamt für Körperschaften II


Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
If you are tearing your hair out on 'Number of Hours' required for tuning
your software, it's time you switch to a better quality performance
engineer.

Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Tue, Jul 5, 2022 at 3:12 PM Charlie Hull <ch...@opensourceconnections.com>
wrote:

> Equally it's not a good management practice to burn engineering hours
> trying to optimise performance to avoid spending (often much less) money
> on sufficient hardware to do the job. I've seen this happen many times,
> sadly.
>
> Charlie
>

Re: Solr eats up all the memory

Posted by Charlie Hull <ch...@opensourceconnections.com>.
Equally it's not a good management practice to burn engineering hours 
trying to optimise performance to avoid spending (often much less) money 
on sufficient hardware to do the job. I've seen this happen many times, 
sadly.

Charlie

On 05/07/2022 10:33, Deepak Goel wrote:
> Not a good software engineering practice to beef up the hardware blindly.
> Of Course when you have tuned the software to a point where you can't tune
> anymore, you can then turn your eyes to hardware.

Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
Not a good software engineering practice to beef up the hardware blindly.
Of Course when you have tuned the software to a point where you can't tune
anymore, you can then turn your eyes to hardware.



On Tue, Jul 5, 2022 at 1:01 AM Dave <ha...@gmail.com> wrote:

> Also for $115 I can buy a terabyte of a Samsung ssd, which helps a lot. It
> comes to a point where money on hardware will outweigh money on engineering
> man power hours, and still come to the same conclusion. As much ram as your
> rack can take and as big and fast of a raid ssd drive it can take. Remember
> since solr is always meant to be destroyed and recreated you don’t have to
> worry much about hardware failure if you just buy two of everything and
> have a backup server ready and waiting to take over while the original
> fails and is reconstructed.

Re: Solr eats up all the memory

Posted by Dave <ha...@gmail.com>.
Also for $115 I can buy a terabyte Samsung SSD, which helps a lot. It comes to a point where money spent on hardware will outweigh money spent on engineering man-hours, and you still come to the same conclusion: as much RAM as your rack can take, and as big and fast a RAID SSD array as it can take. Remember, since Solr is always meant to be destroyed and recreated, you don't have to worry much about hardware failure if you just buy two of everything and keep a backup server ready and waiting to take over while the original fails and is reconstructed.

> On Jul 4, 2022, at 1:32 PM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 7/4/22 03:01, Mike wrote:
>> My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up all
>> the memory and because of that PHP works very, very slowly. What can I do?
> 
> Solr is a Java program.  A Java program will never directly use more memory than you specify for the max heap size.  We cannot make any general recommendations about what heap size you need, because there is a good chance that any recommendation we make would be completely wrong for your install.  I did see that someone recommended not going above 31G ... and this is good advice.  At 32 GB, Java switches to 64-bit pointers instead of 32-bit.  So a heap size of 32 GB actually has LESS memory available than a heap size of 31 GB.
> 
> The OS will use additional memory beyond the heap for caching the index data, but that is completely outside of Solr's control. Note that 64GB total memory for a 500GB index is almost certainly not enough memory, ESPECIALLY if the same server is used for things other than Solr.  I wrote the following wiki page:
> 
> https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
> 
> Others have recommended that you run Solr on dedicated hardware that is not used for any other purpose.  I concur with that recommendation.
> 
> Thanks,
> Shawn
> 
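
Shawn's 31 GB / 32 GB point above is easy to verify on your own JVM. A quick, Solr-independent sanity check (assuming a HotSpot `java` is on the PATH):

```shell
# Print whether compressed (32-bit) object pointers are in effect at a
# given max heap size. At -Xmx32g the JVM falls back to 64-bit oops,
# so usable heap actually shrinks compared to -Xmx31g.
java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
java -Xmx32g -XX:+PrintFlagsFinal -version 2>/dev/null | grep UseCompressedOops
```

The first command should report UseCompressedOops as true, the second as false, on a stock 64-bit HotSpot JVM.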

Re: Solr eats up all the memory

Posted by Dave <ha...@gmail.com>.
Not sure about kube, but with Docker you can simply mount the SSD into the service as a volume alias. Unless you have no control over the metal, this could work?

> On Jul 6, 2022, at 8:34 PM, dmitri maziuk <dm...@gmail.com> wrote:
> 
> On 2022-07-06 2:59 PM, Shawn Heisey wrote:
>> If the mounted filesystem is one that the OS can cache, and there is enough spare memory, then a lot of the mmap requests that Lucene makes won't ever hit the actual disk.  Most block devices can be cached.  I would expect that to be the case for iSCSI, because it should be a block device from the OS perspective.  I think most network filesystems (NFS, SMB, etc) cannot be locally cached.  They are probably cached on the server side, but then you'd be limited by network bandwidth and latency.  The transfer rate of most 7200RPM SATA disks is a little bit faster than gigabit ethernet.
> 
> This way lieth dark magick and madness, of course, but I'm curious what the optimal config would be for a container infra. For bare metal a large PCIe SSD should be the best bang for the buck, but on kube the "disk" is probably iSCSI volumes and you may not have much control over any OS buffering that may exist. Or not.
> 
> Dima

Re: Solr eats up all the memory

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/6/22 18:33, dmitri maziuk wrote:
> This way lieth dark magick and madness, of course, but I'm curious 
> what the optimal config would be for a container infra. For bare metal 
> a large PCIe SSD should be the best bang for the buck, but on kube the 
> "disk" is probably iSCSI volumes and you may not have much control 
> over any OS buffering that may exist. Or not.

Main memory is faster than SSD.  I suspect for a containerized setup, 
the disk cache would need to be large on the physical host rather than 
the container, unless the container itself is what mounts the block 
device for the filesystem.

SSD can improve performance when not enough memory is available for the 
disk cache, but if there is sufficient memory and the load is mostly 
queries, the storage can be slow and cheap, because actual disk reads 
will not be common.  SSD would be recommended for heavy indexing.

There is no generic answer for "How much disk cache do I need?" The 
answer depends on the nature of the data and the nature of the requests 
Solr receives, and there is no such thing as "typical".

Thanks,
Shawn
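
To put rough numbers on "how much disk cache do I have", you can compare the on-disk index size against what the OS currently has available for caching. A Linux-only sketch; the data directory path is the stock install default and may differ on your system:

```shell
# Compare on-disk index size with memory the OS could use for caching.
# If the index is much larger than MemAvailable, queries that miss the
# page cache will hit real disk reads.
INDEX_DIR=/var/solr/data        # stock default; adjust for your install
index_kb=$(du -sk "$INDEX_DIR" 2>/dev/null | awk '{print $1}')
avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
echo "index size:     $(( ${index_kb:-0} / 1024 )) MB"
echo "cache headroom: $(( ${avail_kb:-0} / 1024 )) MB (MemAvailable)"
```

This only bounds the best case; as Shawn notes above, how much cache you actually need depends on the data and the query mix.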


Re: Solr eats up all the memory

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-07-06 2:59 PM, Shawn Heisey wrote:
> 
> If the mounted filesystem is one that the OS can cache, and there is 
> enough spare memory, then a lot of the mmap requests that Lucene makes 
> won't ever hit the actual disk.  Most block devices can be cached.  I 
> would expect that to be the case for iSCSI, because it should be a block 
> device from the OS perspective.  I think most network filesystems (NFS, 
> SMB, etc) cannot be locally cached.  They are probably cached on the 
> server side, but then you'd be limited by network bandwidth and 
> latency.  The transfer rate of most 7200RPM SATA disks is a little bit 
> faster than gigabit ethernet.

This way lieth dark magick and madness, of course, but I'm curious what 
the optimal config would be for a container infra. For bare metal a 
large PCIe SSD should be the best bang for the buck, but on kube the 
"disk" is probably iSCSI volumes and you may not have much control over 
any OS buffering that may exist. Or not.

Dima
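
One way to see the mmap behavior discussed here is to look at how much of the Solr process address space is memory-mapped index segments. A sketch assuming a Linux host with procps installed; the pgrep pattern and file extensions are illustrative Lucene defaults, not guaranteed matches for your install:

```shell
# List memory-mapped Lucene segment files in the Solr process. The
# extensions cover stored fields (.fdt), term dictionary (.tim),
# postings (.doc) and doc values (.dvd).
pid=$(pgrep -f solr | head -n1)   # hypothetical match; adjust as needed
pmap -x "$pid" | grep -E '\.(fdt|tim|doc|dvd)' | head -n 20
```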

Re: Solr eats up all the memory

Posted by Mikhail Khludnev <mk...@apache.org>.
Hi, Shawn.
I don't really follow the thread, but I want to comment on
>  I think most network filesystems (NFS, SMB, etc) cannot be locally
cached.
I played with EFS recently. It seems to be cached locally quite well.
After a searcher opens a mounted index directory, the read rate spikes for
some time, but then, until the next commit, the read rate is negligible.


-- 
Sincerely yours
Mikhail Khludnev

Re: Solr eats up all the memory

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/6/22 10:59, dmitri maziuk wrote:
> mmap() doesn't side-step disk access though, dep. on the number of 
> mmap'ed chunks and chunk size, it can be slow. Especially if your 
> "disk" is an iSCSI volume on a gigabit link to a slow underprovisioned 
> NAS.

If the mounted filesystem is one that the OS can cache, and there is 
enough spare memory, then a lot of the mmap requests that Lucene makes 
won't ever hit the actual disk.  Most block devices can be cached.  I 
would expect that to be the case for iSCSI, because it should be a block 
device from the OS perspective.  I think most network filesystems (NFS, 
SMB, etc) cannot be locally cached.  They are probably cached on the 
server side, but then you'd be limited by network bandwidth and 
latency.  The transfer rate of most 7200RPM SATA disks is a little bit 
faster than gigabit ethernet.

Thanks,
Shawn
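
The caching effect Shawn describes is easy to demonstrate outside Solr. A crude sketch; timings are environment-dependent and this is an illustration, not a benchmark:

```shell
# Read the same file twice: the second pass is typically served from
# the OS page cache rather than the device.
# (After writing, the file is usually already cached; for a true cold
# read, drop caches first as root: echo 3 > /proc/sys/vm/drop_caches)
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=64 status=none
time dd if="$f" of=/dev/null bs=1M status=none   # first read
time dd if="$f" of=/dev/null bs=1M status=none   # repeat read, from cache
rm -f "$f"
```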


Re: Solr eats up all the memory

Posted by dmitri maziuk <dm...@gmail.com>.
On 2022-07-05 10:52 PM, Shawn Heisey wrote:
...
> That is an interesting question.  One of the reasons Lucene queries so 
> fast when there is plenty of memory is because it accesses files on disk 
> directly with MMAP, so there is no need to copy the really massive data 
> structures into the heap at all.

mmap() doesn't side-step disk access though; depending on the number of 
mmap'ed chunks and the chunk size, it can be slow. Especially if your "disk" 
is an iSCSI volume on a gigabit link to a slow, underprovisioned NAS.

The flip side is that, IIRC, when mmap'ing from a RAM disk, the Linux 
kernel is supposedly smart enough to realize the data is already in RAM 
and just flip a pointer. I.e. if you could have a huge RAM disk...

Dima

Re: Solr eats up all the memory

Posted by Dave <ha...@gmail.com>.
In my experience, yes, it will just be slow, but it's hard to truthfully test slowness without a couple tens of thousands of searches to measure against. It won't fail outright; it will just read from the disk. So get an SSD to put the index on, and then, poof, you have a really fast disk to read from.

> On Jul 6, 2022, at 6:38 PM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> 
> Shawn,
> 
>> On 7/5/22 23:52, Shawn Heisey wrote:
>>> On 7/5/2022 3:11 PM, Christopher Schultz wrote:
>>> Well, if you need more than 32GiB, I think the recommendation is to go MUCH HIGHER than 32GiB. If you have a 48GiB machine, maybe restrict to 31GiB of heap, but if you have a TiB, go for it :)
>> I remember reading somewhere, likely for a different program than Solr, that the observed break-even point for 64-bit pointers was 46GB.  The level of debugging and introspection required to calculate that number would be VERY extensive.  Most Solr installs can get by with a max heap size of 31GB or less, even if they are quite large.  For those that need more, I would probably want to see a heap size of at least 64GB.  It is probably better to use SolrCloud and split the index across more servers to keep the heap requirement low than to use a really massive heap.
>>> This is why I said "uhh..." above: the JVM needs more memory than the heap. Sometimes as much as twice that amount, depending upon the workload of the application itself. Measure, measure, measure.
>> It would be interesting to see how much overhead there really is for Solr with various index sizes.  We have seen people have OOM problems when making *only* GC changes ... switching from CMS to G1.  Solr has used G1 out of the box for a while now.
> 
> Anecdotal data point:
> 
> Solr 7.7.3
> Oracle Java 1.8.0_312
> Xms = Xmx = 1024M
> No messing with default GC or other memory settings
> 1 Core, no ZK
> 30s autocommit
> 
> On-disk artifact size:
> $ du -hs /path/to/core
> 723M    /path/to/core
> 
> Live memory info:
> 
> Solr self-reported heap memory used: 205.12 MB [*]
> I reloaded the admin page after writing the "*" note below and it's reporting 55.78 MB heap used.
> 
> Using 'ps' to report real memory usage:
> 
> $ ps aux | grep '\(java\|PID\)'
> USER       PID %CPU %MEM    VSZ   RSS     [...]
> solr     20324  8.1  0.7 6928440 469496   [...]
> 
> So the process space is 6.6G (my 'ps' reports VSZ in kilobytes) and the resident size (aka "actual memory use") is ~460M.
> 
> Solr doesn't report the high-water mark for its heap usage, but the most I've seen so far without a GC kicking it back down is ~200M. So there looks to be about 100% overhead based upon the max heap size.
> 
> I see lots of memory mapped files (both JAR libraries and index-related files) when I do:
> 
> $ sudo lsof -p 20324
> 
> So I suspect a lot of those are mapped into that resident process space. mmap is one of those things that eats up tons of non-heap space and doesn't count toward the Xms/Xmx limit. That's probably why people run out of memory so frequently: they think they can allocate huge amounts of heap space on their big machine when they really need native memory and not quite so much heap.
> 
> [*] I recently restarted Solr because my personal TLS client key had expired; I had to mint a new one and install it. I'd really love to know if Solr/Jetty can re-load its TLS configuration without restarting. It's a real drag to bounce Solr for something so mundane.
> 
>>> I'm in interested to know what the relation is between on-disk index side and in-memory index size. I would imagine that the on-disk artifacts are fairly slim (only storing what is necessary) and the in-memory representation has all kinds of "waste" (like pointers and all that). Has anyone done a back-of-the-napkin calculation to guess at the in-memory size of an index given the on-disk representation?
>> That is an interesting question.  One of the reasons Lucene queries so fast when there is plenty of memory is because it accesses files on disk directly with MMAP, so there is no need to copy the really massive data structures into the heap at all.
> 
> This is likely where lots of that RSS space is being used in my process detailed above.
> 
>> I believe the OP is having problems because they need a total memory size far larger than 64GB to handle 500GB of index data, and they should also have dedicated hardware for Solr so there is no competition with other software for scarce system resources.
> 
> Having never come close to busting my heap with my tiny 500M (on-disk) index, I'm curious about Solr's expected performance with a huge index and small memory. Will Solr just "get by with what it has" or will it really crap itself if the index is too big? I was kinda hoping it would just perform awfully because it has to keep going back to the disk.
> 
> -chris

Re: Solr eats up all the memory

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/6/22 16:38, Christopher Schultz wrote:
> Anecdotal data point:

elyograg@bilbo:/usr/local/src$ ps aux | grep '\(java\|PID\)'
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
solr      852288  1.0  9.5 3808952 771204 ?      Sl   Jul03  59:32 java 
-server -Xms512m -Xmx512m [...]

elyograg@bilbo:/usr/local/src$ java -version
openjdk version "11.0.15" 2022-04-19
OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
OpenJDK 64-Bit Server VM (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1, 
mixed mode, sharing)

1 core, no ZK.  No autoSoftCommit.
     <autoCommit>
       <maxTime>60000</maxTime>
       <openSearcher>true</openSearcher>
     </autoCommit>

elyograg@bilbo:/usr/local/src$ sudo du -hs /var/solr/data
677M    /var/solr/data

A more detailed du can be found here: 
https://paste.elyograg.org/view/898f3e25

elyograg@bilbo:/usr/local/src$ sudo egrep -v "^#|^$" /etc/default/solr.in.sh
SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"
SOLR_PORT="8983"
SOLR_HEAP="512m"
GC_TUNE=" \
   -XX:+UseG1GC \
   -XX:+ParallelRefProcEnabled \
   -XX:MaxGCPauseMillis=100 \
   -XX:+UseLargePages \
   -XX:+AlwaysPreTouch \
   -XX:+ExplicitGCInvokesConcurrent \
   -XX:ParallelGCThreads=2 \
   -XX:+UseStringDeduplication \
   -XX:+UseNUMA \
"
SOLR_JAVA_STACK_SIZE="-Xss1m"
SOLR_ULIMIT_CHECKS=false
SOLR_GZIP_ENABLED=true


Solr version is 10.0.0-SNAPSHOT f8d0d19f981feaf432c4de94187c1677ff48aba5

The hardware is a t3a.large AWS instance, 2 CPUs and 8GB RAM.  It is my 
mailserver, so it is also running postfix, dovecot, haproxy, apache, and 
mysql.

The following screenshot is from a system running Solr 5.x with a 28GB 
heap and over 700 GB of index data.  I no longer have access to this:

https://cwiki.apache.org/confluence/download/attachments/120723332/linux-top-screenshot.png?version=1&modificationDate=1561733774000&api=v2

> Having never come close to busting my heap with my tiny 500M (on-disk) 
> index, I'm curious about Solr's expected performance with a huge index 
> and small memory. Will Solr just "get by with what it has" or will it 
> really crap itself if the index is too big? I was kinda hoping it 
> would just perform awfully because it has to keep going back to the disk.

As long as system resources like the heap, process limit, and open file 
limit are large enough to avoid OOME, Solr should function without 
errors, but if there is insufficient disk cache, performance would be 
terrible.  SSD can help in that situation, but only up to a point ... 
disk cache memory is still a lot faster than SSD.
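A quick order-of-magnitude sketch of why insufficient disk cache hurts so 
much. The latency figures below are typical published ballpark numbers, 
not measurements from this thread:

```python
# Rough latency comparison: page-cache hit vs. NVMe SSD random read.
# Ballpark figures only -- real numbers vary by hardware.
page_cache_ns = 100        # ~100 ns to read data already cached in RAM
nvme_read_ns = 100_000     # ~100 us for an uncached random SSD read

slowdown = nvme_read_ns // page_cache_ns
print(f"uncached read is ~{slowdown}x slower than a page-cache hit")
```

Even a fast SSD is roughly three orders of magnitude slower than RAM, 
which is why a Solr box wants enough spare memory for the OS to keep the 
hot parts of the index in the page cache.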

Thanks,
Shawn


Re: Solr eats up all the memory

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,

On 7/5/22 23:52, Shawn Heisey wrote:
> On 7/5/2022 3:11 PM, Christopher Schultz wrote:
>> Well, if you need more than 32GiB, I think the recommendation is to go 
>> MUCH HIGHER than 32GiB. If you have a 48GiB machine, maybe restrict to 
>> 31GiB of heap, but if you have a TiB, go for it :)
> 
> I remember reading somewhere, likely for a different program than Solr, 
> that the observed break-even point for 64-bit pointers was 46GB.  The 
> level of debugging and introspection required to calculate that number 
> would be VERY extensive.  Most Solr installs can get by with a max heap 
> size of 31GB or less, even if they are quite large.  For those that need 
> more, I would probably want to see a heap size of at least 64GB.  It is 
> probably better to use SolrCloud and split the index across more servers 
> to keep the heap requirement low than to use a really massive heap.
> 
>> This is why I said "uhh..." above: the JVM needs more memory than the 
>> heap. Sometimes as much as twice that amount, depending upon the 
>> workload of the application itself. Measure, measure, measure.
> 
> It would be interesting to see how much overhead there really is for 
> Solr with various index sizes.  We have seen people have OOM problems 
> when making *only* GC changes ... switching from CMS to G1.  Solr has 
> used G1 out of the box for a while now.

Anecdotal data point:

Solr 7.7.3
Oracle Java 1.8.0_312
Xms = Xmx = 1024M
No messing with default GC or other memory settings
1 Core, no ZK
30s autocommit

On-disk artifact size:
$ du -hs /path/to/core
723M	/path/to/core

Live memory info:

Solr self-reported heap memory used: 205.12 MB [*]
I reloaded the admin page after writing the "*" note below and it's 
reporting 55.78 MB heap used.

Using 'ps' to report real memory usage:

$ ps aux | grep '\(java\|PID\)'
USER       PID %CPU %MEM    VSZ   RSS     [...]
solr     20324  8.1  0.7 6928440 469496   [...]

So the process space is 6.6G (my 'ps' reports VSZ in kilobytes) and the 
resident size (aka "actual memory use") is ~460M.

Solr doesn't report the high-water mark for its heap usage, but the most 
I've seen so far without a GC kicking it back down is ~200M. So there 
looks to be about 100% overhead based upon the max heap size.
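Plugging the ps numbers above into a quick sketch (the figures are copied 
from this message; nothing here is measured live):

```python
# Non-heap share of Solr's resident memory, from the ps output above.
heap_high_water_mb = 200      # rough observed heap high-water mark
rss_kb = 469496               # RSS column from ps, in kilobytes
rss_mb = rss_kb / 1024        # ~458 MiB actually resident

# Whatever is resident but not live heap: metaspace, JIT code cache,
# thread stacks, and recently touched mmapped JAR/index pages.
non_heap_mb = rss_mb - heap_high_water_mb
print(f"resident: {rss_mb:.0f} MiB, non-heap share: {non_heap_mb:.0f} MiB")
```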

I see lots of memory mapped files (both JAR libraries and index-related 
files) when I do:

$ sudo lsof -p 20324

So I suspect a lot of those are mapped into that resident process space. 
mmap is one of those things that eats up tons of non-heap space and 
doesn't count toward that Xms/Xmx limit. That's probably why people run 
out of memory so frequently: they think they can allocate huge amounts 
of heap space on their big machine when they really need native memory, 
not quite so much heap.

[*] I recently restarted Solr because my personal TLS client key had 
expired; I had to mint a new one and install it. I'd really love to know 
if Solr/Jetty can re-load its TLS configuration without restarting. It's 
a real drag to bounce Solr for something so mundane.

>> I'm interested to know what the relation is between on-disk index 
>> size and in-memory index size. I would imagine that the on-disk 
>> artifacts are fairly slim (only storing what is necessary) and the 
>> in-memory representation has all kinds of "waste" (like pointers and 
>> all that). Has anyone done a back-of-the-napkin calculation to guess 
>> at the in-memory size of an index given the on-disk representation?
> 
> That is an interesting question.  One of the reasons Lucene queries so 
> fast when there is plenty of memory is because it accesses files on disk 
> directly with MMAP, so there is no need to copy the really massive data 
> structures into the heap at all.

This is likely where lots of that RSS space is being used in my process 
detailed above.

> I believe the OP is having problems because they need a total memory 
> size far larger than 64GB to handle 500GB of index data, and they should 
> also have dedicated hardware for Solr so there is no competition with 
> other software for scarce system resources.

Having never come close to busting my heap with my tiny 500M (on-disk) 
index, I'm curious about Solr's expected performance with a huge index 
and small memory. Will Solr just "get by with what it has" or will it 
really crap itself if the index is too big? I was kinda hoping it would 
just perform awfully because it has to keep going back to the disk.

-chris

Re: Solr eats up all the memory

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/5/2022 3:11 PM, Christopher Schultz wrote:
> Well, if you need more than 32GiB, I think the recommendation is to go 
> MUCH HIGHER than 32GiB. If you have a 48GiB machine, maybe restrict to 
> 31GiB of heap, but if you have a TiB, go for it :)

I remember reading somewhere, likely for a different program than Solr, 
that the observed break-even point for 64-bit pointers was 46GB.  The 
level of debugging and introspection required to calculate that number 
would be VERY extensive.  Most Solr installs can get by with a max heap 
size of 31GB or less, even if they are quite large.  For those that need 
more, I would probably want to see a heap size of at least 64GB.  It is 
probably better to use SolrCloud and split the index across more servers 
to keep the heap requirement low than to use a really massive heap.

> This is why I said "uhh..." above: the JVM needs more memory than the 
> heap. Sometimes as much as twice that amount, depending upon the 
> workload of the application itself. Measure, measure, measure.

It would be interesting to see how much overhead there really is for 
Solr with various index sizes.  We have seen people have OOM problems 
when making *only* GC changes ... switching from CMS to G1.  Solr has 
used G1 out of the box for a while now.

> I'm interested to know what the relation is between on-disk index 
> size and in-memory index size. I would imagine that the on-disk 
> artifacts are fairly slim (only storing what is necessary) and the 
> in-memory representation has all kinds of "waste" (like pointers and 
> all that). Has anyone done a back-of-the-napkin calculation to guess 
> at the in-memory size of an index given the on-disk representation?

That is an interesting question.  One of the reasons Lucene queries so 
fast when there is plenty of memory is because it accesses files on disk 
directly with MMAP, so there is no need to copy the really massive data 
structures into the heap at all.
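A tiny illustration of the mechanism, with Python's mmap module standing 
in for Lucene's MMapDirectory. This is not Solr code, just the same OS 
facility it relies on:

```python
# The mmap access pattern Lucene uses: map an index file into the process
# address space and let the kernel page data in on demand. The mapped
# bytes live in the shared OS page cache, not on the application heap.
import mmap
import os
import tempfile

# Write a small stand-in for an index segment file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"term-dictionary-bytes" * 1000)

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        print(len(mm))    # full file size visible, no read() copies made
        print(mm[:4])     # this slice is page-faulted in lazily

os.remove(path)
```

Reading through the mapping never copies the whole file into the heap, 
which is why a huge index can be searched with a comparatively small Xmx.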

I believe the OP is having problems because they need a total memory 
size far larger than 64GB to handle 500GB of index data, and they should 
also have dedicated hardware for Solr so there is no competition with 
other software for scarce system resources.

Thanks,
Shawn


Re: Solr eats up all the memory

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,

On 7/4/22 13:31, Shawn Heisey wrote:
> On 7/4/22 03:01, Mike wrote:
>> My Solr index size is around 500GB and I have 64GB of RAM. Solr eats 
>> up all
>> the memory and because of that PHP works very, very slowly. What can I 
>> do?
> 
> Solr is a Java program.  A Java program will never directly use more 
> memory than you specify for the max heap size.

Uhh....

> We cannot make any 
> general recommendations about what heap size you need, because there is 
> a good chance that any recommendation we make would be completely wrong 
> for your install.  I did see that someone recommended not going above 
> 31G ... and this is good advice.  At 32 GB, Java switches to 64-bit 
> pointers instead of 32-bit.  So a heap size of 32 GB actually has LESS 
> memory available than a heap size of 31 GB.

Well, if you need more than 32GiB, I think the recommendation is to go 
MUCH HIGHER than 32GiB. If you have a 48GiB machine, maybe restrict to 
31GiB of heap, but if you have a TiB, go for it :)

> The OS will use additional memory beyond the heap for caching the index 
> data, but that is completely outside of Solr's control.

This is why I said "uhh..." above: the JVM needs more memory than the 
heap. Sometimes as much as twice that amount, depending upon the 
workload of the application itself. Measure, measure, measure.

> Note that 64GB total memory for a 500GB index is almost certainly not
> enough memory, ESPECIALLY if the same server is used for things other
> than Solr.

I'm interested to know what the relation is between on-disk index 
size and in-memory index size. I would imagine that the on-disk 
artifacts are fairly slim (only storing what is necessary) and the 
in-memory representation has all kinds of "waste" (like pointers and all 
that). Has anyone done a back-of-the-napkin calculation to guess at the 
in-memory size of an index given the on-disk representation?

-chris

Re: Solr eats up all the memory

Posted by Shawn Heisey <ap...@elyograg.org>.
On 7/4/22 03:01, Mike wrote:
> My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up all
> the memory and because of that PHP works very, very slowly. What can I do?

Solr is a Java program.  A Java program will never directly use more 
memory than you specify for the max heap size.  We cannot make any 
general recommendations about what heap size you need, because there is 
a good chance that any recommendation we make would be completely wrong 
for your install.  I did see that someone recommended not going above 
31G ... and this is good advice.  At 32 GB, Java switches to 64-bit 
pointers instead of 32-bit.  So a heap size of 32 GB actually has LESS 
memory available than a heap size of 31 GB.
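The arithmetic behind that cutoff, as a sketch. The 8 bytes below is the 
default HotSpot object alignment; raise -XX:ObjectAlignmentInBytes and 
the limit moves accordingly:

```python
# Why a 32 GB heap can hold *less* than a 31 GB heap: compressed oops.
# With 32-bit compressed references and the default 8-byte object
# alignment, the JVM can address 2**32 * 8 bytes of heap.
ref_bits = 32
alignment = 8                                 # -XX:ObjectAlignmentInBytes
max_compressed_heap = 2**ref_bits * alignment
print(max_compressed_heap // 2**30, "GiB addressable with compressed oops")

# Request a heap at or beyond that limit and the JVM falls back to full
# 64-bit references, so every object pointer doubles from 4 to 8 bytes
# and eats much of the "extra" gigabyte.
```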

The OS will use additional memory beyond the heap for caching the index 
data, but that is completely outside of Solr's control. Note that 64GB 
total memory for a 500GB index is almost certainly not enough memory, 
ESPECIALLY if the same server is used for things other than Solr.  I 
wrote the following wiki page:

https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
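A back-of-the-napkin version of that sizing argument. The index and RAM 
figures are the OP's; the heap and OS/PHP reservations are hypothetical 
numbers chosen purely for illustration, and the "cache a meaningful 
fraction of the index" idea is a rule of thumb, not an official formula:

```python
# Rough page-cache budget for the OP's box.
index_gb = 500         # on-disk index size from the original post
ram_gb = 64            # total RAM from the original post
heap_gb = 8            # hypothetical Solr heap
os_and_php_gb = 8      # hypothetical reservation for the OS and PHP stack

cache_gb = ram_gb - heap_gb - os_and_php_gb
print(f"memory left for page cache: {cache_gb} GB")
print(f"fraction of index cacheable: {cache_gb / index_gb:.0%}")
```

Being able to cache only about a tenth of the index is exactly the 
"insufficient disk cache" situation described on the wiki page above.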

Others have recommended that you run Solr on dedicated hardware that is 
not used for any other purpose.  I concur with that recommendation.

Thanks,
Shawn


Re: Solr eats up all the memory

Posted by naga pradeep dhulipalla <na...@gmail.com>.
Can you please unsubscribe me from this mailing list?

On Mon, Jul 4, 2022 at 7:28 PM David Hastings <ha...@gmail.com>
wrote:

> in my experience, yes, solr should have its own hardware, and be allowed to
> eat all of it.  never give it more than 31gb of jvm heap, and give it as
> much memory as possible.  64GB should work fine, but I can just go on Amazon
> and buy another 128GB for less than $500; the more the better. Less than
> $1000 will get you into the happy place of over 250GB.
>
> On Mon, Jul 4, 2022 at 5:55 AM Deepak Goel <de...@gmail.com> wrote:
>
> > I wonder if this would be helpful:
> >
> > https://solr.apache.org/guide/6_6/indexconfig-in-solrconfig.html
> >
> > https://solr.apache.org/guide/6_6/jvm-settings.html
> >
> >
> > Deepak
> > "The greatness of a nation can be judged by the way its animals are
> treated
> > - Mahatma Gandhi"
> >
> > +91 73500 12833
> > deicool@gmail.com
> >
> > Facebook: https://www.facebook.com/deicool
> > LinkedIn: www.linkedin.com/in/deicool
> >
> > "Plant a Tree, Go Green"
> >
> > Make In India : http://www.makeinindia.com/home
> >
> >
> > On Mon, Jul 4, 2022 at 3:18 PM Thomas Corthals <th...@klascement.net>
> > wrote:
> >
> > > Hello Mike,
> > >
> > > If possible, run Solr on a separate machine. You're still going to need
> > to
> > > spec it out and configure it to your needs, but at least your client
> code
> > > will keep running.
> > >
> > > Thomas
> > >
> > > On Mon, Jul 4, 2022 at 11:01 Mike <mz...@gmail.com> wrote:
> > >
> > > > Hello!
> > > >
> > > > My Solr index size is around 500GB and I have 64GB of RAM. Solr eats
> up
> > > all
> > > > the memory and because of that PHP works very, very slowly. What can
> I
> > > do?
> > > >
> > > > Thanks
> > > >
> > > > Mike
> > > >
> > >
> >
>

Re: Solr eats up all the memory

Posted by David Hastings <ha...@gmail.com>.
in my experience, yes, solr should have its own hardware, and be allowed to
eat all of it.  never give it more than 31gb of jvm heap, and give it as
much memory as possible.  64GB should work fine, but I can just go on Amazon
and buy another 128GB for less than $500; the more the better. Less than
$1000 will get you into the happy place of over 250GB.

On Mon, Jul 4, 2022 at 5:55 AM Deepak Goel <de...@gmail.com> wrote:

> I wonder if this would be helpful:
>
> https://solr.apache.org/guide/6_6/indexconfig-in-solrconfig.html
>
> https://solr.apache.org/guide/6_6/jvm-settings.html
>
>
> Deepak
> "The greatness of a nation can be judged by the way its animals are treated
> - Mahatma Gandhi"
>
> +91 73500 12833
> deicool@gmail.com
>
> Facebook: https://www.facebook.com/deicool
> LinkedIn: www.linkedin.com/in/deicool
>
> "Plant a Tree, Go Green"
>
> Make In India : http://www.makeinindia.com/home
>
>
> On Mon, Jul 4, 2022 at 3:18 PM Thomas Corthals <th...@klascement.net>
> wrote:
>
> > Hello Mike,
> >
> > If possible, run Solr on a separate machine. You're still going to need
> to
> > spec it out and configure it to your needs, but at least your client code
> > will keep running.
> >
> > Thomas
> >
> > On Mon, Jul 4, 2022 at 11:01 Mike <mz...@gmail.com> wrote:
> >
> > > Hello!
> > >
> > > My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up
> > all
> > > the memory and because of that PHP works very, very slowly. What can I
> > do?
> > >
> > > Thanks
> > >
> > > Mike
> > >
> >
>

Re: Solr eats up all the memory

Posted by Deepak Goel <de...@gmail.com>.
I wonder if this would be helpful:

https://solr.apache.org/guide/6_6/indexconfig-in-solrconfig.html

https://solr.apache.org/guide/6_6/jvm-settings.html


Deepak
"The greatness of a nation can be judged by the way its animals are treated
- Mahatma Gandhi"

+91 73500 12833
deicool@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"

Make In India : http://www.makeinindia.com/home


On Mon, Jul 4, 2022 at 3:18 PM Thomas Corthals <th...@klascement.net>
wrote:

> Hello Mike,
>
> If possible, run Solr on a separate machine. You're still going to need to
> spec it out and configure it to your needs, but at least your client code
> will keep running.
>
> Thomas
>
> On Mon, Jul 4, 2022 at 11:01 Mike <mz...@gmail.com> wrote:
>
> > Hello!
> >
> > My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up
> all
> > the memory and because of that PHP works very, very slowly. What can I
> do?
> >
> > Thanks
> >
> > Mike
> >
>

Re: Solr eats up all the memory

Posted by Thomas Corthals <th...@klascement.net>.
Hello Mike,

If possible, run Solr on a separate machine. You're still going to need to
spec it out and configure it to your needs, but at least your client code
will keep running.

Thomas

On Mon, Jul 4, 2022 at 11:01 Mike <mz...@gmail.com> wrote:

> Hello!
>
> My Solr index size is around 500GB and I have 64GB of RAM. Solr eats up all
> the memory and because of that PHP works very, very slowly. What can I do?
>
> Thanks
>
> Mike
>