Posted to issues@trafodion.apache.org by "Selvaganesan Govindarajan (JIRA)" <ji...@apache.org> on 2018/01/08 01:28:00 UTC

[jira] [Commented] (TRAFODION-2888) Streamline setjmp/longjmp concepts in Trafodion

    [ https://issues.apache.org/jira/browse/TRAFODION-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315550#comment-16315550 ] 

Selvaganesan Govindarajan commented on TRAFODION-2888:
------------------------------------------------------

It is true that every effort should be made to ensure the process continues to run, but to retain the stability of the cluster, it should be acceptable to bring down a process or a node.

I think the out-of-memory (OOM) condition and memory allocation failure are entirely orthogonal.

An OOM condition can happen when there is memory pressure or when RAM is exhausted. It could be due to:
a) There being more processes in the system than it can handle
b) Some processes building up virtual memory due to memory leaks
c) Running out of swap space

Memory allocation failures rarely happen in a 64-bit addressing scheme unless some limit, such as the number of PTEs (page table entries), is reached at either the process or the system level. The process dump I analyzed had allocated a huge amount of memory, of which only 1.6 GB was accounted SQL memory managed via the Trafodion heap infrastructure.

An OOM condition can lead to memory allocation failure, but by then it is too late: the OOM killer would already have kicked in and killed some process, making the node unusable anyway.

If longjmp/setjmp needs to work correctly with the heap, it needs to be associated with the top-level heap cli_globals::executorMemory (even for the ESP process) because EsgynDB heap management is hierarchical. A lower-level heap asks its parent heap to allocate a block when it can't satisfy the request from an already allocated block, and this continues until it reaches the top-level heap. In multi-threaded ESPs this heap is used from multiple threads by marking the heap thread safe. But setjmp/longjmp are thread-safe only when the code never does a setjmp in one thread and a longjmp to that context from another thread, and that cannot be guaranteed in a multi-threaded ESP.
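
To make the single-thread constraint concrete, here is a minimal C++ sketch (allocFailureEnv, tryAllocate, and runFragment are illustrative names, not Trafodion code): the jmp_buf filled by setjmp is a valid longjmp target only for the same thread, and only while the frame that called setjmp is still live.

    #include <csetjmp>
    #include <cstdlib>

    // One recovery point per thread: a jmp_buf shared across threads would let
    // thread A longjmp into thread B's stack, which is undefined behavior.
    static thread_local std::jmp_buf allocFailureEnv;

    void *tryAllocate(std::size_t len) {
      void *p = std::malloc(len);
      if (p == nullptr)
        std::longjmp(allocFailureEnv, 1);  // must target the SAME thread's setjmp
      return p;
    }

    void runFragment() {
      if (setjmp(allocFailureEnv) != 0) {
        // Recovery path: an allocation below failed; release state and bail out.
        return;
      }
      void *buf = tryAllocate(1024);
      // ... work ...
      std::free(buf);
    }

Note also that in C++ a longjmp that bypasses stack frames with non-trivial destructors is itself undefined behavior, which is one more reason to keep this pattern narrowly scoped.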

NAMemory::setJmpBuf is supposed to assert when threadSafe is set to true. 
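
In outline, that guard could look like the following (a hypothetical sketch of the intent; threadSafe_ and jmpBuf_ are assumed member names, not verified against the actual NAMemory source):

    void NAMemory::setJmpBuf(jmp_buf *jmpBuf) {
      // Refuse to register a longjmp target on a heap shared across threads,
      // since the thread that set the buffer may not be the one that fails.
      assert(!threadSafe_);
      jmpBuf_ = jmpBuf;
    }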

In legacy Trafodion code, all memory allocations in the executor came from the heap infrastructure, but that is no longer the case. I have seen Trafodion heap memory constitute less than 10-20% of the total virtual memory of the process. In some scenarios it could be much worse because of memory fragmentation, as seen in the core dump. So a memory allocation failure, if it happens, is most likely to occur in other parts of the code.

So it is imperative that memory growth/leaks are managed proactively in Trafodion processes. My suggestion would be to check for memory pressure in the cluster, or virtual memory growth in the process, at some logical points. For example, it is possible to prevent new queries in mxosrvr if the virtual memory of the mxosrvr process exceeds a certain value. This restriction would make sense when the application needs to execute multiple statements simultaneously. If there is only one user SQL statement active at any point in time, then the memory growth seen in the mxosrvr process is most likely due to a memory leak. Currently the Trafodion code doesn't detect this leak and recover the mxosrvr from it before the next user SQL statement is submitted, but it is possible to incorporate such self-healing concepts.
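
Such a check could run at those logical points along these lines (a Linux-only sketch; vmLimitBytes and the admission hook are assumptions, not existing Trafodion/mxosrvr APIs):

    #include <fstream>
    #include <unistd.h>

    // Returns true when the process's total virtual size exceeds limitBytes.
    // /proc/self/statm reports sizes in pages; the first field is the total
    // program (virtual) size.
    bool vmSizeExceeds(long limitBytes) {
      long vmPages = 0;
      std::ifstream statm("/proc/self/statm");
      statm >> vmPages;
      return vmPages * sysconf(_SC_PAGESIZE) > limitBytes;
    }

    // At a logical point, e.g. before admitting a new statement:
    //   if (vmSizeExceeds(vmLimitBytes)) { /* reject the query or recycle mxosrvr */ }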

To ensure that the process continues to run, the setjmp/longjmp concepts are retained in the compiler for all cases other than memory allocation failure (which shouldn't happen at all).



> Streamline setjmp/longjmp concepts in Trafodion
> -----------------------------------------------
>
>                 Key: TRAFODION-2888
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2888
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: sql-general
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>             Fix For: 2.3
>
>
> I happened to come across a core dump with a longjmp in the executor layer that brought down the node. Unfortunately, the core dump wasn't useful for figuring out the root cause of the longjmp. Hence,
> a) I wonder, is there a way to figure out from the core what caused the longjmp?
> b) If not, why do a longjmp at all? It might be better to let the process dump naturally by accessing the invalid address or null pointer right at the point of failure.
> Was longjmp put in place in the legacy Trafodion code base to avoid a node being brought down when privileged code runs into a segmentation violation?
> If a) is not possible, I would want to remove the remnants of setjmp and longjmp from the code to enable us to debug such issues better.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)