You are viewing a plain text version of this content. The canonical link for it is here.
Posted to codereview@trafodion.apache.org by selvaganesang <gi...@git.apache.org> on 2018/03/12 23:59:01 UTC

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

GitHub user selvaganesang opened a pull request:

    https://github.com/apache/trafodion/pull/1470

    [TRAFODION-2853] memory leak of ComDiagsArea in CmpContext heap of mxosrvr

     The ComDiagsArea is allocated in many places and from different heaps in Trafodion, making it difficult to  detect the source of the leak. Hence, a different approach is taken to fix this issue.
        
    Currently, ComDiagsArea is allocated in many places unconditionally even when SQL statement completes execution without any error or warnings. Then it is deallocated. Changed this strategy and allocate ComDiagsArea only when there is an error or warning while compiling or executing the SQL statement.
        
    This should help the product in two ways
        1) To reduce the path length.  The smaller query execution would benefit the most by chopping of few more microseconds.
        2) Reduce the memory growth due to leaked ComDiagsArea


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/selvaganesang/trafodion arkcmp_issue

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1470.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1470
    
----
commit b97982c4494e078c5de2d883442d86265f24dadc
Author: selvaganesang <se...@...>
Date:   2018-03-09T01:19:35Z

    [TRAFODION-2853] memory leak of ComDiagsArea in CmpContext heap of mxosrvr
    
    The ComDiagsArea is allocated in many places and from different heaps in Trafodion, making it difficult to
    detect the source of the leak. Hence, a different approach is taken to fix this issue.
    
    Currently, ComDiagsArea is allocated in many places unconditionally even when SQL statement completes
    execution without any error or warnings. Then it is deallocated. Changed this strategy and
    allocate ComDiagsArea only when there is an error or warning while compiling or executing the SQL statement.
    
    This should help the product in two ways
    1) To reduce the pathlength.  The smaller query execution would benefit the most by chopping of few more microseconds.
    2) Reduce the memory growth due to leaked ComDiagsArea

commit 07f41ddb3042ac039252bd09955fb59bb80c8f9a
Author: selvaganesang <se...@...>
Date:   2018-03-12T23:53:49Z

    [TRAFODION-2853] memory leak of ComDiagsArea in CmpContext heap of mxosrvr
    
    Fixes for the regression failures seen with b97982c4494e078c5de2d883442d86265f24dadc
    
    This includes the change to report the error at the time of compilation
    for invoke, showddl commands. Earlier errors were ignored during
    prepare time and reported only at the time of execute for these commands

----


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1470


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174285545
  
    --- Diff: core/sql/executor/ex_root.cpp ---
    @@ -2659,33 +2663,21 @@ void ex_root_tcb::registerCB(ComDiagsArea *&diagsArea)
           return;
         }
       }
    -  NABoolean diagsAreaAllocated = FALSE;
    -
    -  if (diagsArea == NULL)
    -  {
    -     diagsAreaAllocated = TRUE;
    -     diagsArea = ComDiagsArea::allocate(getHeap());
    -  }
    +  Lng32 fromCond = 0;
    +  if (diagsArea != NULL)
    +      fromCond = diagsArea->mark();
       ExSsmpManager *ssmpManager = context->getSsmpManager();
    -  cbServer_ = ssmpManager->getSsmpServer(
    +  cbServer_ = ssmpManager->getSsmpServer((NAHeap *)getHeap(),
                                      cliGlobals->myNodeName(), 
                                      cliGlobals->myCpu(), diagsArea);
       if (cbServer_ == NULL || cbServer_->getControlConnection() == NULL)		
       {		
           // We could not get a phandle for the cancel broker.  However,		
           // let the query run (on the assumption that it will not need to 		
           // be canceled) and convert any error conditions to warnings.		
    -
    -      // tbd - figure a way retry registration later, as the query progresses.		
    -      if (diagsArea != NULL)		
    -         NegateAllErrors(diagsArea);		
    +      diagsArea->negateErrors(fromCond); 
    --- End diff --
    
    Yes.  The caller is responsible to deallocate it


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174292065
  
    --- Diff: core/sql/sqlci/SqlCmd.cpp ---
    @@ -440,25 +461,50 @@ void handleLocalError(ComDiagsArea &diags, SqlciEnv *sqlci_env)
       // when HandleCLIError() is called with a error after a CLI call.
       // Soln :10-021203-3433
     
    -  if (diags.getNumber(DgSqlCode::ERROR_)) {
    +  if (diags->getNumber(DgSqlCode::ERROR_)) {
          worstcode = SQL_Error;
       }
    -  else if (diags.getNumber(DgSqlCode::WARNING_)) {
    +  else if (diags->getNumber(DgSqlCode::WARNING_)) {
         worstcode = SQL_Warning;
       }
     
       if (!lastLineWasABlank) log->WriteAllWithoutEOL("");
       lastLineWasABlank = TRUE;
     
       ostringstream errMsg;
    -  NADumpDiags(errMsg, &diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
    +  NADumpDiags(errMsg, diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
                   sqlci_env->getTerminalCharset());
     
       errMsg << ends;
     
       log->WriteAllWithoutEOL(errMsg.str().c_str());
     }
     
    +Int64 getRowsAffected(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_ROW_COUNT, &rowsAffected,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return rowsAffected; 
    +   else
    +      return -1;
    +}
    +
    +Int64 getDiagsCondCount(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   Int64 diagsCondCount;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_NUMBER, &diagsCondCount,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return 0; 
    +   else
    +      return diagsCondCount;
    --- End diff --
    
    I believe the return value is context()->diags()->mainSQLCode()  which would be 100 from the previous CLI call.   Yes. It is bizarre.  This change was done to get the warnings when some regressions failed.  


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by selvaganesang <gi...@git.apache.org>.
Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174679546
  
    --- Diff: core/sql/sqlci/SqlCmd.cpp ---
    @@ -440,25 +461,50 @@ void handleLocalError(ComDiagsArea &diags, SqlciEnv *sqlci_env)
       // when HandleCLIError() is called with a error after a CLI call.
       // Soln :10-021203-3433
     
    -  if (diags.getNumber(DgSqlCode::ERROR_)) {
    +  if (diags->getNumber(DgSqlCode::ERROR_)) {
          worstcode = SQL_Error;
       }
    -  else if (diags.getNumber(DgSqlCode::WARNING_)) {
    +  else if (diags->getNumber(DgSqlCode::WARNING_)) {
         worstcode = SQL_Warning;
       }
     
       if (!lastLineWasABlank) log->WriteAllWithoutEOL("");
       lastLineWasABlank = TRUE;
     
       ostringstream errMsg;
    -  NADumpDiags(errMsg, &diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
    +  NADumpDiags(errMsg, diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
                   sqlci_env->getTerminalCharset());
     
       errMsg << ends;
     
       log->WriteAllWithoutEOL(errMsg.str().c_str());
     }
     
    +Int64 getRowsAffected(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_ROW_COUNT, &rowsAffected,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return rowsAffected; 
    +   else
    +      return -1;
    +}
    +
    +Int64 getDiagsCondCount(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   Int64 diagsCondCount;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_NUMBER, &diagsCondCount,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return 0; 
    +   else
    +      return diagsCondCount;
    --- End diff --
    
    I just realized that I misread the function SQLCLI_ReturnCode.  It looks like diagsCondCount is uninitialized and hence junk value was returned for getDiagsCondCount that enabled us to return warnings. Basically two issues made this function to behave as expected most of the time. I will be fixing it soon


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174277858
  
    --- Diff: core/sql/executor/ex_root.cpp ---
    @@ -2659,33 +2663,21 @@ void ex_root_tcb::registerCB(ComDiagsArea *&diagsArea)
           return;
         }
       }
    -  NABoolean diagsAreaAllocated = FALSE;
    -
    -  if (diagsArea == NULL)
    -  {
    -     diagsAreaAllocated = TRUE;
    -     diagsArea = ComDiagsArea::allocate(getHeap());
    -  }
    +  Lng32 fromCond = 0;
    +  if (diagsArea != NULL)
    +      fromCond = diagsArea->mark();
       ExSsmpManager *ssmpManager = context->getSsmpManager();
    -  cbServer_ = ssmpManager->getSsmpServer(
    +  cbServer_ = ssmpManager->getSsmpServer((NAHeap *)getHeap(),
                                      cliGlobals->myNodeName(), 
                                      cliGlobals->myCpu(), diagsArea);
       if (cbServer_ == NULL || cbServer_->getControlConnection() == NULL)		
       {		
           // We could not get a phandle for the cancel broker.  However,		
           // let the query run (on the assumption that it will not need to 		
           // be canceled) and convert any error conditions to warnings.		
    -
    -      // tbd - figure a way retry registration later, as the query progresses.		
    -      if (diagsArea != NULL)		
    -         NegateAllErrors(diagsArea);		
    +      diagsArea->negateErrors(fromCond); 
    --- End diff --
    
    The method's caller worries about deallocating the diagsArea in this code path, right?


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174278397
  
    --- Diff: core/sql/export/ComDiags.h ---
    @@ -837,7 +837,7 @@ class ComDiagsArea : public IpcMessageObj {
     
       // These members provide set and get operations on the data
       // of a ComDiagsArea that is defined in ANSI table 21, in subclause
    -  // 18.1.  See also, ``Creating Errors Korrectly.''
    +  // 18.1.  See also, ``Creating Errors Correctly.''
    --- End diff --
    
    I think "Creating Errors Korrectly" is a reference to a paper Bill Rassieur wrote way back in HP days. (And yes he intentionally mis-spelled "Correctly".) I don't know if a copy is up on the Trafodion web site or not. Might be nice to put it there if not.


---

[GitHub] trafodion pull request #1470: [TRAFODION-2853] memory leak of ComDiagsArea i...

Posted by DaveBirdsall <gi...@git.apache.org>.
Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1470#discussion_r174279774
  
    --- Diff: core/sql/sqlci/SqlCmd.cpp ---
    @@ -440,25 +461,50 @@ void handleLocalError(ComDiagsArea &diags, SqlciEnv *sqlci_env)
       // when HandleCLIError() is called with a error after a CLI call.
       // Soln :10-021203-3433
     
    -  if (diags.getNumber(DgSqlCode::ERROR_)) {
    +  if (diags->getNumber(DgSqlCode::ERROR_)) {
          worstcode = SQL_Error;
       }
    -  else if (diags.getNumber(DgSqlCode::WARNING_)) {
    +  else if (diags->getNumber(DgSqlCode::WARNING_)) {
         worstcode = SQL_Warning;
       }
     
       if (!lastLineWasABlank) log->WriteAllWithoutEOL("");
       lastLineWasABlank = TRUE;
     
       ostringstream errMsg;
    -  NADumpDiags(errMsg, &diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
    +  NADumpDiags(errMsg, diags, TRUE/*newline*/, 0, NULL, log->isVerbose(),
                   sqlci_env->getTerminalCharset());
     
       errMsg << ends;
     
       log->WriteAllWithoutEOL(errMsg.str().c_str());
     }
     
    +Int64 getRowsAffected(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_ROW_COUNT, &rowsAffected,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return rowsAffected; 
    +   else
    +      return -1;
    +}
    +
    +Int64 getDiagsCondCount(SQLSTMT_ID *stmt)
    +{
    +   Int32 rc;
    +   Int64 diagsCondCount;
    +   rc = SQL_EXEC_GetDiagnosticsStmtInfo2(stmt,
    +                           SQLDIAG_NUMBER, &diagsCondCount,
    +                           NULL, 0, NULL);
    +   if (rc == 0)
    +      return 0; 
    +   else
    +      return diagsCondCount;
    --- End diff --
    
    This is bizzarre. So if SQL_EXEC_GetDiagnosticsStmtInfo2 returns an error, we return diagsCondCount?


---