Posted to user@ignite.apache.org by eugene miretsky <eu...@gmail.com> on 2018/08/22 18:30:25 UTC

How much heap to allocate

Hi,

I am getting the following warning when starting Ignite - "

Nodes started on local machine require more than 20% of physical RAM what
can lead to significant slowdown due to swapping
"

The 20% is a typo in version 2.5, it should be 80%.

We have increased the max size of the default region to 70% of the
available memory on the instance (since that's the only region we use at
the moment).
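
For reference, a minimal sketch of how such a region limit can be set
programmatically (the RAM figure and region name are illustrative assumptions,
not recommendations):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class RegionSizeSketch {
    public static void main(String[] args) {
        // Hypothetical sizing: give the default region ~70% of physical RAM,
        // as described above. Adjust for your own machine.
        long physicalRam = 128L * 1024 * 1024 * 1024; // assume a 128 GB box
        long regionSize = (long) (physicalRam * 0.7);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setDefaultDataRegionConfiguration(
            new DataRegionConfiguration()
                .setName("Default_Region")
                .setMaxSize(regionSize));

        IgniteConfiguration cfg = new IgniteConfiguration()
            .setDataStorageConfiguration(storageCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // Node started with the enlarged default data region.
        }
    }
}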

From reading the code
<https://github.com/apache/ignite/blob/0681d8725d096272815dbd20c078871b27449895/modules/core/src/main/java/org/apache/ignite/internal/IgniteKernal.java>
that generates the warning, it seems like
1) Ignite adds all the memory across all nodes to check if it is above the
safeToUse threshold. I would expect the check to be done per node
2) totalOffheap seems to be the sum of the maxSizes of all regions, and
totalHeap is retrieved from the JVM configs. ignite.sh sets -Xmx200g.

Assuming we are not enabling on-heap caching, what should we set the heap
size to?

Cheers,
Eugene

Re: How much heap to allocate

Posted by eugene miretsky <eu...@gmail.com>.
My understanding is that lazy loading doesn't work with group_by.

On Tue, Sep 18, 2018 at 10:11 AM Mikhail <mi...@gmail.com>
wrote:

> Hi Eugene,
>
> >For #2: wouldn't H2 need to bring the data into the heap to make the
> queries?
> > Or at least some of the data to do the group_by and sum operation?
>
> Yes, Ignite will bring data from off-heap to heap. If the data set is
> too big for heap memory, you may need to set the lazy flag for your query:
>
> https://apacheignite-sql.readme.io/docs/performance-and-debugging#result-set-lazy-load
>
> Thanks,
> Mike.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: How much heap to allocate

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Yes, the Ignite server node will send batches of data to the client.
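
For example, the size of those batches can be tuned per query (a minimal
sketch; the table and cache names are assumptions):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.FieldsQueryCursor;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class PageSizeSketch {
    // Sketch: assumes a started Ignite client "ignite" and a table MY_TABLE.
    static void streamRows(Ignite ignite) {
        SqlFieldsQuery qry = new SqlFieldsQuery("SELECT * FROM MY_TABLE");
        qry.setPageSize(512); // rows per batch sent from the server to the client

        try (FieldsQueryCursor<List<?>> cur =
                 ignite.cache("SQL_PUBLIC_MY_TABLE").query(qry)) {
            for (List<?> row : cur)
                System.out.println(row); // rows arrive page by page; earlier pages can be GC'd
        }
    }
}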

Regards,
-- 
Ilya Kasnacheev


Wed, Sep 19, 2018, 08:17, Ray <ra...@cisco.com>:

> Hi Mikhail,
>
> Can you explain how lazy loading works when I use the SQL "select *
> from table" to query a big table which doesn't fit in on-heap memory?
> Does Ignite send part of the result set to the client?
> If Ignite still sends the whole result set back to the client, how can lazy
> load avoid loading the full result set on heap?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: How much heap to allocate

Posted by Ray <ra...@cisco.com>.
Hi Mikhail,

Can you explain how lazy loading works when I use the SQL "select *
from table" to query a big table which doesn't fit in on-heap memory?
Does Ignite send part of the result set to the client?
If Ignite still sends the whole result set back to the client, how can lazy
load avoid loading the full result set on heap?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: How much heap to allocate

Posted by Mikhail <mi...@gmail.com>.
Hi Eugene,

>For #2: wouldn't H2 need to bring the data into the heap to make the
queries?
> Or at least some of the data to do the group_by and sum operation?

Yes, Ignite will bring data from off-heap to heap. If the data set is
too big for heap memory, you may need to set the lazy flag for your query:
https://apacheignite-sql.readme.io/docs/performance-and-debugging#result-set-lazy-load
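
For example, a minimal sketch of setting that flag (table and cache names are
assumptions):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.FieldsQueryCursor;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class LazyQuerySketch {
    // Sketch: assumes a started Ignite instance "ignite" and a table MY_TABLE.
    static void lazySelect(Ignite ignite) {
        SqlFieldsQuery qry = new SqlFieldsQuery("SELECT * FROM MY_TABLE")
            .setLazy(true); // stream the result instead of materializing it on the heap

        try (FieldsQueryCursor<List<?>> cur =
                 ignite.cache("SQL_PUBLIC_MY_TABLE").query(qry)) {
            for (List<?> row : cur)
                System.out.println(row); // already-consumed rows become eligible for GC
        }
    }
}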

Thanks,
Mike.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: How much heap to allocate

Posted by eugene miretsky <eu...@gmail.com>.
Thanks!

For #2: wouldn't H2 need to bring the data into the heap to make the
queries? Or at least some of the data to do the group_by and sum operation?

On Mon, Sep 10, 2018 at 6:19 AM Vladimir Ozerov <vo...@gridgain.com>
wrote:

> Hi Eugene,
>
> Answering your questions:
> 1) Grouping is performed on both mapper and reducer (coordinator). If you
> group by affinity key, you may try setting "SqlFieldsQuery.colocated=true"
> to bypass grouping on reducer
> 2) For this specific query H2 will store (customer_id, count(*),
> sum(views)) for every customer_id. It is hard to guess how much space it
> would take in heap, but I think it would be ~50-100 bytes per customer_id.
> So if you have N customers, it would be (100 * N) bytes
> 3) Please see
> https://apacheignite-sql.readme.io/docs/performance-and-debugging
>
> Vladimir.

Re: How much heap to allocate

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi Eugene,

Answering your questions:
1) Grouping is performed on both mapper and reducer (coordinator). If you
group by affinity key, you may try setting "SqlFieldsQuery.colocated=true"
to bypass grouping on the reducer (see the sketch below)
2) For this specific query H2 will store (customer_id, count(*),
sum(views)) for every customer_id. It is hard to guess how much space it
would take in heap, but I think it would be ~50-100 bytes per customer_id.
So if you have N customers, it would be (100 * N) bytes
3) Please see
https://apacheignite-sql.readme.io/docs/performance-and-debugging
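
To illustrate point 1, a hedged sketch built around the example query discussed
later in this thread (the Java setter is spelled setCollocated; table, cache,
and column names are assumptions, with event_date standing in for the "date"
column):

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class CollocatedGroupBySketch {
    // Sketch: assumes a started Ignite instance and the CUSTOMER_EVENTS table exist.
    static void runGroupBy(Ignite ignite) {
        SqlFieldsQuery qry = new SqlFieldsQuery(
            "SELECT customer_id, COUNT(*) FROM CUSTOMER_EVENTS " +
            "WHERE event_date < ? AND category IN ('C1', 'C2', 'C3') " +
            "GROUP BY customer_id HAVING SUM(views) > 20")
            .setArgs(java.sql.Date.valueOf("2018-08-01"))
            // Only valid if rows are truly co-located by the GROUP BY key;
            // it skips the extra grouping step on the reducer.
            .setCollocated(true);

        List<List<?>> rows = ignite.cache("SQL_PUBLIC_CUSTOMER_EVENTS").query(qry).getAll();
        System.out.println("Groups: " + rows.size());
    }
}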

Vladimir.

On Thu, Aug 30, 2018 at 5:57 PM eugene miretsky <eu...@gmail.com>
wrote:

> Thanks again for the detailed response!
>
> Our main use case is performing large SQL queries over tables with 200M+
> rows - wanted to give you a bit more details and context you can pass along
>
> A simple example would be:
>
>    - Table: customer_id, date, category, views, clicks ( pkey =
>    "customer_id, date", affinity key = date )
>    - Query: SELECT count(*) where date < X AND category in (C1, C2, C3)
>    GROUP BY customer_id HAVING SUM(views) > 20
>
> My main concerns are
> 1) How is the group by performed. You mentioned that it is performed on
> the coordinator; I was hoping that since we are grouping using a column
> that is an affinity key, each node will be able to do its own group by
> 2) How much heap should I allocate for the group by stage
> 3) General performance tips
>
> Cheers,
> Eugene

Re: How much heap to allocate

Posted by eugene miretsky <eu...@gmail.com>.
Thanks again for the detailed response!

Our main use case is performing large SQL queries over tables with 200M+
rows - wanted to give you a bit more details and context you can pass along

A simple example would be:

   - Table: customer_id, date, category, views, clicks ( pkey =
   "customer_id, date", affinity key = date )
   - Query: SELECT count(*) where date < X AND category in (C1, C2, C3)
   GROUP BY customer_id HAVING SUM(views) > 20 (a DDL sketch for this table
   follows below)
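
For reference, a sketch of how such a table could be declared through Ignite
DDL (names are illustrative; event_date stands in for "date" to avoid the SQL
keyword, and the WITH parameters are just one possible choice):

import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.configuration.CacheConfiguration;

public class CreateExampleTableSketch {
    static void createTable(Ignite ignite) {
        SqlFieldsQuery ddl = new SqlFieldsQuery(
            "CREATE TABLE IF NOT EXISTS CUSTOMER_EVENTS (" +
            "  customer_id BIGINT," +
            "  event_date  DATE," +
            "  category    VARCHAR," +
            "  views       INT," +
            "  clicks      INT," +
            "  PRIMARY KEY (customer_id, event_date)" +
            ") WITH \"template=partitioned,affinity_key=event_date\"");

        // Any cache handle can be used to execute DDL statements.
        ignite.getOrCreateCache(new CacheConfiguration<>("ddlCache")).query(ddl).getAll();
    }
}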

My main concerns are
1) How is the group by performed. You mentioned that it is performed on
the coordinator; I was hoping that since we are grouping using a column
that is an affinity key, each node will be able to do its own group by
2) How much heap should I allocate for the group by stage
3) General performance tips

Cheers,
Eugene


On Thu, Aug 30, 2018 at 1:32 AM Denis Magda <dm...@apache.org> wrote:

> Eugene,
>
> Just want to be sure you know about the existence of the following pages
> which elaborate on Ignite memory architecture in details:
>
>    -
>    https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Entriesandpagesindurablememory
>    -
>    https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood
>
>
>
>> 1) Are indexes loaded into heap (when used)?
>>
>
> Something might be copied to disk but in most of the cases we perform
> comparisons and other operations directly off-heap.
> See org.apache.ignite.internal.processors.query.h2.database.InlineIndexHelper
> and related classes.
>
> 2) Are full pages loaded into heap, or only the matching records?
>>
>
> Matching records (result set) are presently loaded. The pages are not.
>
>
>> 3) When the query needs more processing than the existing index
>> (non-indexed columns, groupBy, aggregates) where/how does it happen?
>>
>
> We will be doing a full scan. Grouping and aggregations are finalized on
> the query coordinator which needs to get a full result set.
>
> 4) How is the query coordinator chosen? Is it the client node? How about
>> when using the web console?
>>
>
> That's your application. Web Console uses Ignite SQL APIs as well.
>
>
>> 5) What parallelism settings would you recommend? We were thinking to set
>> parallelJobsNumber to 1 and task parallelism to number of cores * 2 -
>> this way we can make sure that each job gets all the heap memory instead of
>> all jobs fighting each other. Not sure if it makes sense, and it will also
>> prevent us from making real time transactional queries. (We
>> are hoping to use Ignite for both OLAP and simple real time queries.)
>
>
> I would start a separate discussion for this bringing this question to the
> attention of our SQL experts. I'm not the one of them.
>
> --
> Denis

Re: How much heap to allocate

Posted by Denis Magda <dm...@apache.org>.
Eugene,

Just want to be sure you know about the existence of the following pages
which elaborate on Ignite memory architecture in detail:

   -
   https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood#IgniteDurableMemory-underthehood-Entriesandpagesindurablememory
   -
   https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood



> 1) Are indexes loaded into heap (when used)?
>

Something might be copied to disk but in most of the cases we perform
comparisons and other operations directly off-heap.
See org.apache.ignite.internal.processors.query.h2.database.InlineIndexHelper
and related classes.

2) Are full pages loaded into heap, or only the matching records?
>

Matching records (result set) are presently loaded. The pages are not.


> 3) When the query needs more processing than the existing index
> (non-indexed columns, groupBy, aggregates) where/how does it happen?
>

We will be doing a full scan. Grouping and aggregations are finalized on
the query coordinator which needs to get a full result set.

4) How is the query coordinator chosen? Is it the client node? How about
> when using the web console?
>

That's your application. Web Console uses Ignite SQL APIs as well.


> 5) What parallelism settings would you recommend? We were thinking to set
> parallelJobsNumber to 1 and task parallelism to number of cores * 2 -
> this way we can make sure that each job gets all the heap memory instead of
> all jobs fighting each other. Not sure if it makes sense, and it will also
> prevent us from making real time transactional queries. (We
> are hoping to use Ignite for both OLAP and simple real time queries.)


I would start a separate discussion for this, bringing this question to the
attention of our SQL experts. I'm not one of them.

--
Denis


Re: How much heap to allocate

Posted by eugene miretsky <eu...@gmail.com>.
Denis, thanks for the detailed response.

A few more follow up questions
1) Are indexes loaded into heap (when used)?
2) Are full pages loaded into heap, or only the matching records?
3) When the query needs more processing than the existing index
(non-indexed columns, groupBy, aggregates) where/how does it happen?
4) How is the query coordinator chosen? Is it the client node? How about
when using the web console?
5) What parallelism settings would you recommend? We were thinking to set
parallelJobsNumber to 1 and task parallelism to number of cores * 2 -
this way we can make sure that each job gets all the heap memory instead of
all jobs fighting each other. Not sure if it makes sense, and it will also
prevent us from making real time transactional queries. (We are hoping to use
Ignite for both OLAP and simple real time queries; a configuration sketch
follows below.)
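
For reference, a sketch of the two knobs such a plan would usually touch
(whether they map exactly onto the settings named above is an assumption, and
the values simply mirror the idea in the question, not a recommendation):

import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.fifoqueue.FifoQueueCollisionSpi;

public class ParallelismSketch {
    static IgniteConfiguration configure() {
        int cores = Runtime.getRuntime().availableProcessors();

        // Per-cache SQL query parallelism (index-scan threads per node for this cache).
        CacheConfiguration<Object, Object> cacheCfg =
            new CacheConfiguration<>("SQL_PUBLIC_CUSTOMER_EVENTS")
                .setQueryParallelism(cores * 2);

        // Limit concurrently executing compute jobs per node (akin to parallelJobsNumber above).
        FifoQueueCollisionSpi collisionSpi = new FifoQueueCollisionSpi();
        collisionSpi.setParallelJobsNumber(1);

        return new IgniteConfiguration()
            .setCacheConfiguration(cacheCfg)
            .setCollisionSpi(collisionSpi);
    }
}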

Cheers,
Eugene


On Sat, Aug 25, 2018 at 3:25 AM Denis Magda <dm...@apache.org> wrote:

> Hello Eugene,
>
> 1) In what format is data stored off heap?
>
>
> Data is always stored in the binary format let it be on-heap, off-heap or
> Ignite persistence.
> https://apacheignite.readme.io/docs/binary-marshaller
>
> 2) What happens when a SQL query is executed, in particular
>
>>
>>    - How is H2 used? How is data loaded in H2? What if some of the  data
>>    is on disk?
>>
>> H2 is used to build execution plans for SELECTs. H2 calls Ignite's B+Tree
> based indexing implementation to see which indexes are set. All the data
> and indexes are always stored in Ignite (off-heap + disk).
>
>>
>>    - When is data loaded into heap, and how much? Is only the output of
>>    H2 loaded, or everything?
>>
>> Queries results are stored in Java heap temporarily. Once the result set
> is read by your application, it will be garbage collected.
>
>>
>>    - How is the reduce stage performed? Is it performed only on one node
>>    (hence that node needs to load all the data into memory)
>>
>> Correct, the final result set is reduced on a query coordinator - your
> application that executed a SELECT.
>
> 3) What happens when Ignite runs out of memory during execution? Is data
>> evicted to disk (if persistence is enabled)?
>
>
> I guess you mean what happens if a result set doesn't fit in RAM during
> the execution, right? If so, then OOM will occur. We're working on an
> improvement that will offload the result set to disk to avoid OOM for all
> the scenarios:
> https://issues.apache.org/jira/browse/IGNITE-7526
>
>
>
>> 4) Based on the code, it looks like I need to set my data region size to
>> at most 50% of available memory (to avoid the warning), this seems a bit
>> wasteful.
>
>
> There is no such a requirement. I know many deployments use cases when one
> data region is given 20% of RAM, the other is given 40% and everything else
> is persisted to disk.
>
> 5) Do you have any general advice on benchmarking the memory requirement?
>> So far I have not been able to find a way to check how much memory each
>> table takes on and off heap, and how much memory each query takes.
>
>
> We use Yardstick for performance benchmarking:
> https://apacheignite.readme.io/docs/perfomance-benchmarking
>
> --
> Denis

Re: How much heap to allocate

Posted by Denis Magda <dm...@apache.org>.
Hello Eugene,

1) In what format is data stored off heap?


Data is always stored in the binary format, be it on-heap, off-heap, or
Ignite persistence.
https://apacheignite.readme.io/docs/binary-marshaller
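
As a side note, a minimal sketch of reading an entry in that binary form
directly, without deserializing it into a class (the cache name is an
assumption):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.binary.BinaryObject;

public class BinaryReadSketch {
    static void readBinary(Ignite ignite, Object key) {
        // withKeepBinary() keeps values in their binary form on read.
        IgniteCache<Object, BinaryObject> cache =
            ignite.cache("SQL_PUBLIC_CUSTOMER_EVENTS").withKeepBinary();

        BinaryObject row = cache.get(key);
        System.out.println(row); // individual fields are available via row.field("...")
    }
}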

2) What happens when a SQL query is executed, in particular

>
>    - How is H2 used? How is data loaded in H2? What if some of the  data
>    is on disk?
>
> H2 is used to build execution plans for SELECTs. H2 calls Ignite's B+Tree
based indexing implementation to see which indexes are set. All the data
and indexes are always stored in Ignite (off-heap + disk).

>
>    - When is data loaded into heap, and how much? Is only the output of
>    H2 loaded, or everything?
>
> Query results are stored in Java heap temporarily. Once the result set
is read by your application, it will be garbage collected.

>
>    - How is the reduce stage performed? Is it performed only on one node
>    (hence that node needs to load all the data into memory)
>
> Correct, the final result set is reduced on a query coordinator - your
application that executed a SELECT.

3) What happens when Ignite runs out of memory during execution? Is data
> evicted to disk (if persistence is enabled)?


I guess you mean what happens if a result set doesn't fit in RAM during the
execution, right? If so, then OOM will occur. We're working on an
improvement that will offload the result set to disk to avoid OOM for all
the scenarios:
https://issues.apache.org/jira/browse/IGNITE-7526



> 4) Based on the code, it looks like I need to set my data region size to
> at most 50% of available memory (to avoid the warning), this seems a bit
> wasteful.


There is no such requirement. I know of many deployment use cases where one
data region is given 20% of RAM, another is given 40%, and everything else
is persisted to disk.

5) Do you have any general advice on benchmarking the memory requirement?
> So far I have not been able to find a way to check how much memory each
> table takes on and off heap, and how much memory each query takes.


We use Yardstick for performance benchmarking:
https://apacheignite.readme.io/docs/perfomance-benchmarking

--
Denis


Re: How much heap to allocate

Posted by eugene miretsky <eu...@gmail.com>.
Thanks!

I am trying to understand when and how data is moved from off-heap to on
heap, particularly when using SQL.  I took a look at the wiki
<https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Durable+Memory+-+under+the+hood>
but
still have a few questions

My understanding is that data is always stored off-heap.

1) In what format is data stored off heap?
2) What happens when a SQL query is executed, in particular

   - How is H2 used? How is data loaded in H2? What if some of the  data is
   on disk?
   - When is data loaded into heap, and how much? Is only the output of H2
   loaded, or everything?
   - How is the reduce stage performed? Is it performed only on one node
   (hence that node needs to load all the data into memory)

3) What happens when Ignite runs out of memory during execution? Is data
evicted to disk (if persistence is enabled)?
4) Based on the code, it looks like I need to set my data region size to at
most 50% of available memory (to avoid the warning), which seems a bit
wasteful.
5) Do you have any general advice on benchmarking the memory requirement?
So far I have not been able to find a way to check how much memory each
table takes on and off heap, and how much memory each query takes.

Cheers,
Eugene

On Fri, Aug 24, 2018 at 8:06 AM, NSAmelchev <ns...@gmail.com> wrote:

> Hi Eugene,
>
> Yes, it's a misprint as Dmitry wrote.
>
> Ignite prints this warning if nodes on the local machine require more than
> 80% of physical RAM.
>
> From the code, you can see that the total heap/off-heap memory is summed
> over the nodes having the same MAC address. This is how the total memory
> used by the local machine is calculated.
>
> --
> Best wishes,
> Amelchev Nikita
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: How much heap to allocate

Posted by NSAmelchev <ns...@gmail.com>.
Hi Eugene,

Yes, it's a misprint as Dmitry wrote.

Ignite prints this warning if nodes on the local machine require more than
80% of physical RAM.

From the code, you can see that the total heap/off-heap memory is summed
over the nodes having the same MAC address. This is how the total memory
used by the local machine is calculated.
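
In other words, the check roughly amounts to the following simplified sketch
(not the actual IgniteKernal code; the helper type and numbers are made up for
illustration):

import java.util.List;

public class RamCheckSketch {
    // Sum heap + off-heap over all local nodes (same MAC address) and compare
    // the total against 80% of the machine's physical RAM.
    static boolean requiresTooMuchRam(List<NodeMem> localNodes, long physicalRamBytes) {
        long total = 0;
        for (NodeMem n : localNodes)
            total += n.heapMaxBytes + n.offHeapMaxBytes; // -Xmx plus data region max sizes

        double safeToUse = physicalRamBytes * 0.8; // the 80% threshold from the warning
        return total > safeToUse;
    }

    static class NodeMem {
        long heapMaxBytes;    // JVM -Xmx of one local node
        long offHeapMaxBytes; // sum of that node's configured data region max sizes
    }
}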

-- 
Best wishes,
Amelchev Nikita



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: How much heap to allocate

Posted by Dmitriy Pavlov <dp...@gmail.com>.
Hi Eugene,

The misprint was corrected in https://issues.apache.org/jira/browse/IGNITE-7824 and
the fix will be released as part of 2.7.

Even if you don't have on-heap caches, entries will still be unmarshalled
from off-heap to on-heap during reads. So the right choice of -Xmx depends
heavily on the particular application scenario.

Sincerely,
Dmitriy Pavlov

Wed, Aug 22, 2018, 21:30, eugene miretsky <eu...@gmail.com>:

> Hi,
>
> I am getting the following warning when starting Ignite - "
>
> Nodes started on local machine require more than 20% of physical RAM what
> can lead to significant slowdown due to swapping
> "
>
> The 20% is a typo in version 2.5, it should be 80%.
>
> We have increased the max size of the default region to 70% of the
> available memory on the instance (since that's the only region we use at
> the moment).
>
> From reading the code
> <https://github.com/apache/ignite/blob/0681d8725d096272815dbd20c078871b27449895/modules/core/src/main/java/org/apache/ignite/internal/IgniteKernal.java> that
> generates the warning, it seems like
> 1) Ignite adds all the memory across all nodes to check if it is above the
> safeToUse threshold. I would expect the check to be done per node
> 2) totalOffheap seems to be the sum of the maxSizes of all regions, and
> totalHeap is retrieved from the JVM configs. ignite.sh sets -Xmx200g.
>
> Assuming we are not enabling on-heap caching, what should we set the heap
> size to?
>
> Cheers,
> Eugene
>