Posted to issues@trafodion.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/02/01 16:30:00 UTC

[jira] [Work logged] (TRAFODION-3260) SSMP may wait 3 seconds before handling requests

     [ https://issues.apache.org/jira/browse/TRAFODION-3260?focusedWorklogId=193447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-193447 ]

ASF GitHub Bot logged work on TRAFODION-3260:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Feb/19 16:29
            Start Date: 01/Feb/19 16:29
    Worklog Time Spent: 10m 
      Work Description: selvaganesang commented on pull request #1786: [TRAFODION-3260] SSMP may wait 3 seconds before handling requests
URL: https://github.com/apache/trafodion/pull/1786#discussion_r253113311
 
 

 ##########
 File path: core/sql/bin/ex_ssmp_main.cpp
 ##########
 @@ -246,8 +247,12 @@ void runServer(Int32 argc, char **argv)
     }
   }
 */
-    // wait for system messages only until ssmp starts receiving msgs.
-    cc->wait(300);
+    // Wait for messages, but we need ssmp to wake up periodically to
+    // perform garbage collection.
+    short mask = XWAIT(LREQ | LDONE, ssmpGlobals->getStatsMergeTimeout());
+    if (mask & LREQ) {
+      cc->wait(0);
 
 Review comment:
   Yes. It looks like waitOnAll has a problem when a timeout value is specified. The suggested change can still leave SSMP waiting in the same way, though less severely than before. So we can choose one of three options:
   1) Merge this PR with no more changes.
   2) Add a simple encapsulation that moves the XWAIT call from ex_ssmp_main.cpp into the IPC layer and returns a value indicating whether completion occurred on the ControlConnection or on other client connections, handled with wait(IpcImmediately) as this PR does now.
   3) Make sure waitOnAll works correctly when a timeout value is specified, which gives us better encapsulation.
   
   I will let the author, @zhenxingh, decide among these options.
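
One possible shape for option 2 is sketched below, purely as an illustration. Every name in it (kReqBit, kDoneBit, WaitOutcome, LowLevelWait, waitForAnyCompletion) is hypothetical and does not exist in Trafodion. The low-level wait is passed in as a callable so that the sketch stands alone; in the real code that role would be played by the XWAIT call shown in the diff. Mapping the request bit to the ControlConnection is an assumption read from the diff, where LREQ leads to cc->wait(0).

```cpp
#include <cassert>
#include <functional>

// Hypothetical event bits standing in for LREQ and LDONE; the real values
// come from the platform headers and are not reproduced here.
constexpr short kReqBit  = 0x1;
constexpr short kDoneBit = 0x2;

// What the server loop learns from one combined wait.
enum class WaitOutcome { TimedOut, ControlConnection, OtherConnections, Both };

// Stand-in for the low-level wait call: takes an event mask and a timeout,
// and returns the subset of requested events that fired (0 on timeout).
using LowLevelWait = std::function<short(short mask, int timeout)>;

// Hypothetical IPC-layer helper: a single wait that covers both event kinds
// and translates the raw mask into a value the caller can switch on.
WaitOutcome waitForAnyCompletion(const LowLevelWait& lowLevelWait, int timeout) {
    assert(timeout >= 0);
    const short wanted = static_cast<short>(kReqBit | kDoneBit);
    const short fired = lowLevelWait(wanted, timeout);
    const bool req  = (fired & kReqBit) != 0;
    const bool done = (fired & kDoneBit) != 0;
    if (req && done) return WaitOutcome::Both;
    if (req)  return WaitOutcome::ControlConnection;  // assumption: request events belong to the control connection
    if (done) return WaitOutcome::OtherConnections;
    return WaitOutcome::TimedOut;
}
```

With a helper of this kind, ex_ssmp_main.cpp would no longer need to know the event bits; it would branch on the returned value and then call wait(IpcImmediately) on whichever connection completed.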
 


Issue Time Tracking
-------------------

    Worklog Id:     (was: 193447)
    Time Spent: 3h 40m  (was: 3.5h)

> SSMP may wait 3 seconds before handling requests
> ------------------------------------------------
>
>                 Key: TRAFODION-3260
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3260
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: any
>            Reporter: He Zhenxing
>            Priority: Major
>             Fix For: 2.4
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> SSMP may wait and stop responding for up to 3 seconds before handling requests. This problem was found while investigating a LOB locking issue in which acquiring the lock may take 3 seconds.
> Here are the steps to reproduce the issue:
>  
> {code:java}
> >> cqd traf_blob_as_varchar 'off';
> >> create table t1 (a blob);
> >> set statistics on;
> >> insert into t1 values (stringtolob('abc'));
> >> insert into t1 values (stringtolob('abc'));
> {code}
>  
> We ignore the first insert, which may take a long time because it loads metadata. Starting from the second, an insert sometimes takes 3 or 6 seconds to finish. Normally it should take only hundreds of milliseconds.
> The problem occurs because SSMP waits for client requests from $RECEIVE and for replies from SSCP separately. SSMP can therefore be waiting on $RECEIVE while replies from SSCP are already pending, in which case it must wait until the timeout (3 seconds) expires before it can handle the replies and then reply to the client. If both the LOB lock and the unlock hit this delay, the insert takes more than 6 seconds to finish.
> This problem also affects other scenarios that need to interact with SSMP.
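
The delay described above can be illustrated with a toy model in virtual time. This is not Trafodion code: both function names are hypothetical, and the model assumes that no client request arrives while the server is blocked, so a wait on $RECEIVE alone always runs to its timeout.

```cpp
#include <cassert>

// Separate waits: the server checks for replies, then blocks on client
// requests alone for timeoutMs. A reply that arrives during that block is
// only noticed at the next wake-up.
long handledAtWithSeparateWait(long replyAtMs, long timeoutMs) {
    assert(replyAtMs >= 0 && timeoutMs > 0);
    long now = 0;
    while (now < replyAtMs) {
        now += timeoutMs;  // blocked on requests only; the wait times out
    }
    return now;  // first wake-up at or after the reply's arrival
}

// Combined wait: one wait covers both requests and replies, so an arriving
// reply wakes the server immediately.
long handledAtWithCombinedWait(long replyAtMs) {
    assert(replyAtMs >= 0);
    return replyAtMs;
}
```

With a 3000 ms timeout, a reply that arrives at 100 ms is handled at 3000 ms in the first model and at 100 ms in the second. Two such stalls in a row, one for the lock and one for the unlock, account for the 6-second case.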


