Posted to dev@lucene.apache.org by Lyuba Romanchuk <ly...@gmail.com> on 2013/04/09 10:47:45 UTC

Adding new functionality to avoid "java.lang.OutOfMemoryError: Java heap space" exception

Hi all,

We run Solr (4.2 and 5.0) in a real-time environment with big data. Each
day two Solr cores are generated; each can reach ~8-10 GB, depending on
insertion rates and on the hardware.

Currently, all cores are loaded on Solr startup.

The query rate is not high, but responses must be quick and must be
returned even for old data and over large time frames.

There are a lot of simple queries (facet/facet.pivot on fields with few
distinct values), but there are also heavy queries, such as facet.pivot on
fields with very many distinct values. We use distributed search to query
the cores, and a query usually spans 1-2 weeks of data (around 7-28 cores).
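
For illustration, a heavy query of this kind might look like the following
(host, core, and field names are invented; the facet.pivot over two
high-cardinality fields is what makes it expensive):

    http://host:8983/solr/core_a/select?q=*:*&rows=0
        &shards=host:8983/solr/core_a,host:8983/solr/core_b
        &facet=true&facet.pivot=wideFieldA,wideFieldB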

After some large queries (with facet.pivot on such widely distributed
fields) we sometimes encounter a "java.lang.OutOfMemoryError: Java heap
space" exception.

The software is deployed to customer sites, so increasing memory is not
always possible, and customers may be willing to accept slower responses
for the larger queries if that is what we can provide.

We looked at the LotsOfCores functionality that was added in 4.1 and 4.2.
It allows defining an upper limit on the number of loaded cores and unloads
them on an LRU basis when the cache gets full. However, in our case a more
general mechanism seems to be needed:

* Only cores that are used for updates/inserts must be loaded at all times.
Other cores, which are only queried, should be loaded/unloaded on demand
while a query runs, until completion – according to memory demands.

* Each facet, facet.pivot must be estimated for memory consumption. In case
there is not enough memory to run the query for all cores concurrently it
must be separated into sequential queries, unloading already queried or
irrelevant cores (but not permanent cores) and loading older cores to
complete the query.

* Occasionally, the oldest cores should be unloaded according to a
configurable policy (for example, one type of high-volume core would be
kept loaded for 1 week, while smaller cores could remain loaded for a
month); a sketch of such a policy follows this list. The policy lets data
we know is queried less, but is higher volume, be kept live over shorter
time periods.
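
A minimal sketch of what such an age-based policy could look like, assuming
a per-core-type maximum age; the AgeBasedUnloadPolicy class and its wiring
into core management are hypothetical and do not exist in Solr:

    import java.time.Duration;
    import java.time.Instant;
    import java.util.Map;

    // Hypothetical age-based unload policy: unload a core once it is older
    // than the limit configured for its type, e.g. "highVolume" -> 7 days,
    // "small" -> 30 days.
    class AgeBasedUnloadPolicy {
        private static final Duration DEFAULT_MAX_AGE = Duration.ofDays(30);
        private final Map<String, Duration> maxAgeByCoreType;

        AgeBasedUnloadPolicy(Map<String, Duration> maxAgeByCoreType) {
            this.maxAgeByCoreType = maxAgeByCoreType;
        }

        boolean shouldUnload(String coreType, Instant coreCreated) {
            Duration maxAge = maxAgeByCoreType.getOrDefault(coreType, DEFAULT_MAX_AGE);
            return Duration.between(coreCreated, Instant.now()).compareTo(maxAge) > 0;
        }
    }

A background task would periodically apply shouldUnload to the
non-permanent cores and unload those past their limit.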

We are considering adding the following functionality to Solr (optional –
turned on by new configs):

The flow of the SolrCore.execute() function will be changed as follows (a
Java sketch of the same flow follows the list):


   - Change the status of the core to “USED”
   - Call the waitForResource(SolrRequestHandler, SolrQueryRequest) function:
      - estimate the required memory for this query/handler on this core
      - if there are not enough free resources to run the query:
         - if all cores are permanent and can’t be unloaded:
            - throw an "OutOfMemoryError" exception // here the status of
            the core should be changed to “UNUSED”
         - else:
            - try to unload unused, non-permanent cores
            - if unloading unused cores didn’t release enough resources and
            no more cores can be unloaded:
               - throw an "OutOfMemoryError" exception // here the status
               of the core should be changed to “UNUSED”
            - if unloading unused cores didn’t release enough resources but
            there are still cores that can be unloaded:
               - wait with a timeout until some resource is released
               - check again until the required resource is available or
               the exception is thrown
      - reserve the resource once it is available
   - Call the current SolrCore.execute()
   - Change the status of the core to “UNUSED”
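
A minimal, self-contained Java sketch of this flow, assuming a hypothetical
ResourceTracker that does the memory bookkeeping; none of these types or
methods exist in Solr:

    // Hypothetical sketch of the proposed flow; nothing here exists in Solr.
    interface ResourceTracker {
        long estimateRequiredMemory(Object handler, Object req); // per-query estimate
        boolean tryReserve(long bytes);            // false if not enough memory is free
        void release(long bytes);
        boolean allRemainingCoresPermanent();      // nothing left that could be unloaded
        boolean unloadOneUnusedNonPermanentCore(); // false if none is unloadable right now
        void awaitRelease(long timeoutMs) throws InterruptedException;
    }

    class ResourceAwareExecution {
        private static final long WAIT_TIMEOUT_MS = 5_000;

        static void execute(ResourceTracker tracker, Runnable currentExecute,
                            Object handler, Object req) throws InterruptedException {
            // core status would be set to "USED" here (and back to "UNUSED"
            // if waitForResource throws)
            long needed = tracker.estimateRequiredMemory(handler, req);
            waitForResource(tracker, needed);      // may throw OutOfMemoryError
            try {
                currentExecute.run();              // the existing SolrCore.execute() flow
            } finally {
                tracker.release(needed);
                // core status would be set back to "UNUSED" here
            }
        }

        private static void waitForResource(ResourceTracker tracker, long needed)
                throws InterruptedException {
            while (!tracker.tryReserve(needed)) {
                if (tracker.allRemainingCoresPermanent()) {
                    throw new OutOfMemoryError("all loaded cores are permanent");
                }
                if (!tracker.unloadOneUnusedNonPermanentCore()) {
                    // nothing unloadable right now: wait for a running query
                    // to release memory, then re-check the reservation
                    tracker.awaitRelease(WAIT_TIMEOUT_MS);
                }
            }
            // reservation succeeded; 'needed' bytes stay reserved until release()
        }
    }

The reservation is held for the lifetime of the request and released in the
finally block, mirroring the USED/UNUSED status changes above.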

We would like to get some initial feedback on the design / functionality
we’re proposing as we feel this really benefits real-time, high volume
indexing systems such as ours. We are also happy to contribute the code
back if you feel there is a need for this functionality.

Best regards,

Lyuba

Re: Adding new functionality to avoid "java.lang.OutOfMemoryError: Java heap space" exception

Posted by Erick Erickson <er...@gmail.com>.
On a quick glance, I think this would be difficult. How could one
estimate memory without loading the core? Facets in particular are
sensitive to the number of unique terms in the field. One could
probably work it backwards, that is, load the cores as necessary and
_measure_ the memory consumption. You'd then have to store that
information someplace though.
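
A crude way to take such a measurement (loadCoreAndWarm is a placeholder,
and GC-based heap sampling is noisy at best):

    // Rough heap-delta sampling around loading and warming a core.
    // System.gc() is only a hint, so the result is an estimate at best.
    public class HeapDelta {
        static long usedHeap() {
            Runtime rt = Runtime.getRuntime();
            System.gc();
            return rt.totalMemory() - rt.freeMemory();
        }

        static long estimateCoreFootprint(Runnable loadCoreAndWarm) {
            long before = usedHeap();
            loadCoreAndWarm.run(); // load the core and run a warming facet query
            long after = usedHeap();
            return Math.max(0, after - before); // persist this per core/query shape
        }
    }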

It seems like you can get relatively close to this by specifying a set
of cores with transient="false" and the rest with transient="true",
but that's certainly not going to satisfy the complex requirements
you've outlined.
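
For reference, a legacy-style solr.xml along those lines (names invented)
would keep the insert cores permanently loaded and let the rest be swapped
in and out of the LRU transient cache:

    <solr persistent="true">
      <cores adminPath="/admin/cores" transientCacheSize="8">
        <!-- always loaded: the cores receiving today's inserts -->
        <core name="core_today" instanceDir="core_today"
              transient="false" loadOnStartup="true"/>
        <!-- loaded on demand, evicted on an LRU basis when the cache is full -->
        <core name="core_old" instanceDir="core_old"
              transient="true" loadOnStartup="false"/>
      </cores>
    </solr>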

That said, it feels like your design is a band-aid; clients are then
_still_ going to put too much information on too little hardware, but
you know your problem space better than I do.

But before you start working there, be aware that this code is
evolving fairly quickly. SOLR-4662 should have the structure in
reasonably stable condition, and I hope to get that done this coming
weekend. You might want to wait until that gets committed to do more
than exploratory work as the code base may change out from underneath
you.

Good luck!
Erick

On Tue, Apr 9, 2013 at 7:02 AM, Lyuba Romanchuk
<ly...@gmail.com> wrote:
> It seems like the bullets don't render nicely, so I'm re-sending the
> explanation without them.
>
> The flow of SolrCore.execute() function will be changed:
>
> Change the status of the core to “USED” and call the
> waitForResource(SolrRequestHandler, SolrQueryRequest) function; after that,
> perform the current SolrCore.execute() flow and change the status of the
> core back to “UNUSED”.
>
> In the waitForResource(SolrRequestHandler, SolrQueryRequest) function,
> first estimate the required memory for this query/handler on this core.
> If there are not enough free resources to run the query, and unloading all
> unused, non-permanent cores still does not free enough, throw an
> "OutOfMemoryError" exception and change the status of the core to “UNUSED”;
> otherwise, wait with a timeout until some resource is released, then check
> again until the required resource is available or the exception is thrown.
>
> Best regards,
>
> Lyuba
>
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Fwd: Adding new functionality to avoid "java.lang.OutOfMemoryError: Java heap space" exception

Posted by Lyuba Romanchuk <ly...@gmail.com>.
It seems like the bullets don't render nicely, so I'm re-sending the
explanation without them.

The flow of SolrCore.execute() function will be changed:

Change the status of the core to “USED” and call the
waitForResource(SolrRequestHandler, SolrQueryRequest) function; after that,
perform the current SolrCore.execute() flow and change the status of the
core back to “UNUSED”.

In the waitForResource(SolrRequestHandler, SolrQueryRequest) function,
first estimate the required memory for this query/handler on this core.
If there are not enough free resources to run the query, and unloading all
unused, non-permanent cores still does not free enough, throw an
"OutOfMemoryError" exception and change the status of the core to “UNUSED”;
otherwise, wait with a timeout until some resource is released, then check
again until the required resource is available or the exception is thrown.

Best regards,

Lyuba

---------- Forwarded message ----------
From: Lyuba Romanchuk <ly...@gmail.com>
Date: Tue, Apr 9, 2013 at 11:47 AM
Subject: Adding new functionality to avoid "java.lang.OutOfMemoryError:
Java heap space" exception
To: dev@lucene.apache.org


[...]