You are viewing a plain text version of this content. The canonical link for it is here.

Posted to codereview@trafodion.apache.org by CoderSong2015 <gi...@git.apache.org> on 2018/04/24 03:24:37 UTC

[GitHub] trafodion pull request #1537: [Trafodion-3041] Support watchdog query cache

GitHub user CoderSong2015 opened a pull request:

    https://github.com/apache/trafodion/pull/1537

    [Trafodion-3041] Support watchdog query cache

    To improvement performance when every query executed needs to be written into repository.
    
    The time when the queries cached will be published into repository is determined by cache time or number of queries. You can configure it in the file dcs/conf/dcs-site.xml.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/CoderSong2015/Apache-Trafodion master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafodion/pull/1537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1537
    
----
commit c9570e33d1ebbc5d92bc8868be3f53f8ad10f318
Author: Haolin.song <40...@...>
Date:   2018-04-24T03:11:07Z

    [Trafodion-3041] Support watchdog query cache to improvement performance when every query executed needs to be written into repository.
    The time when the queries cached will be published into repository is determined by cache time or number of queries. You can configure it in the file
    conf/dcs-site.xml.

----


---

[GitHub] trafodion pull request #1537: [TRAFODION-3041] Support watchdog query cache

Posted by CoderSong2015 <gi...@git.apache.org>.

Github user CoderSong2015 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1537#discussion_r183935686
  
    --- Diff: core/conn/odbc/src/odbc/nsksrvr/SrvrConnect.cpp ---
    @@ -530,11 +542,39 @@ static void* SessionWatchDog(void* arg)
                     okToGo = false;
                 }
             }
    -
    -
    -		while(!record_session_done && okToGo)
    -		{
    -			REPOS_STATS repos_stats = repos_queue.get_task();
    +        vector< vector<string> > query_list;
    +        vector<string> session_start;
    +        vector<string> statement_new_query;
    +        vector<string> session_stat_aggregation;
    +
    +        session_start.push_back("upsert into Trafodion.\"_REPOS_\".metric_session_table values");
    +        statement_new_query.push_back("insert into Trafodion.\"_REPOS_\".metric_query_table values");
    +        session_stat_aggregation.push_back("insert into Trafodion.\"_REPOS_\".metric_query_aggr_table values");
    +
    +        query_list.push_back(session_start);
    +        query_list.push_back(statement_new_query);
    +        query_list.push_back(session_stat_aggregation);
    +
    +        int query_limit = statisticsCacheSize;
    +        int time_limit = aggrInterval;
    +        //0:None 1:update 2:insert/upsert cache limit 3:achieve timeline
    +        int execute_flag = REPOS_EXECUTE_NONE;
    +        clock_t time_start = clock();
    +        clock_t time_end= clock();
    +
    +        REPOS_STATS repos_stats;
    +        while(!record_session_done && okToGo)
    +        {
    +            time_start = clock();
    +            while(repos_queue.isEmpty() && (((time_end = clock()) - time_start) / 1000000 < time_limit));
    --- End diff --
    
    Thanks.  I think you are right that spin loops will burn CPU needlessly. As for your second point, every mxosrvr will have only  one watchdog thread, and only this function will call repos_queue.get_task(). Besides, the whole watchdog function  will be locked at the beginning and unlocked at the end so that I think it is thread-safe and doesn't need to add lock there. Thanks again and look forward to your insight. @DaveBirdsall 


---

[GitHub] trafodion pull request #1537: [TRAFODION-3041] Support watchdog query cache

Posted by venkat1m <gi...@git.apache.org>.

Github user venkat1m commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1537#discussion_r183937495
  
    --- Diff: core/conn/odbc/src/odbc/nsksrvr/SrvrConnect.cpp ---
    @@ -530,11 +542,39 @@ static void* SessionWatchDog(void* arg)
                     okToGo = false;
                 }
             }
    -
    -
    -		while(!record_session_done && okToGo)
    -		{
    -			REPOS_STATS repos_stats = repos_queue.get_task();
    +        vector< vector<string> > query_list;
    +        vector<string> session_start;
    +        vector<string> statement_new_query;
    +        vector<string> session_stat_aggregation;
    +
    +        session_start.push_back("upsert into Trafodion.\"_REPOS_\".metric_session_table values");
    +        statement_new_query.push_back("insert into Trafodion.\"_REPOS_\".metric_query_table values");
    +        session_stat_aggregation.push_back("insert into Trafodion.\"_REPOS_\".metric_query_aggr_table values");
    +
    +        query_list.push_back(session_start);
    +        query_list.push_back(statement_new_query);
    +        query_list.push_back(session_stat_aggregation);
    +
    +        int query_limit = statisticsCacheSize;
    +        int time_limit = aggrInterval;
    +        //0:None 1:update 2:insert/upsert cache limit 3:achieve timeline
    +        int execute_flag = REPOS_EXECUTE_NONE;
    +        clock_t time_start = clock();
    +        clock_t time_end= clock();
    +
    +        REPOS_STATS repos_stats;
    +        while(!record_session_done && okToGo)
    +        {
    +            time_start = clock();
    +            while(repos_queue.isEmpty() && (((time_end = clock()) - time_start) / 1000000 < time_limit));
    --- End diff --
    
    If you set the statisticsCacheSize to 1, would we preserve the current behavior where mxosrvr publishes the repository records when it is time. For example session start is published as soon as session starts and session end is updated as soon as session ends. Like wise we need aggregate stats every 1 minute. Will the existing behavior be retained by setting statisticsCacheSize to 1 ?


---

[GitHub] trafodion pull request #1537: [TRAFODION-3041] Support watchdog query cache

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/trafodion/pull/1537


---

[GitHub] trafodion pull request #1537: [TRAFODION-3041] Support watchdog query cache

Posted by CoderSong2015 <gi...@git.apache.org>.

Github user CoderSong2015 commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1537#discussion_r183941529
  
    --- Diff: core/conn/odbc/src/odbc/nsksrvr/SrvrConnect.cpp ---
    @@ -530,11 +542,39 @@ static void* SessionWatchDog(void* arg)
                     okToGo = false;
                 }
             }
    -
    -
    -		while(!record_session_done && okToGo)
    -		{
    -			REPOS_STATS repos_stats = repos_queue.get_task();
    +        vector< vector<string> > query_list;
    +        vector<string> session_start;
    +        vector<string> statement_new_query;
    +        vector<string> session_stat_aggregation;
    +
    +        session_start.push_back("upsert into Trafodion.\"_REPOS_\".metric_session_table values");
    +        statement_new_query.push_back("insert into Trafodion.\"_REPOS_\".metric_query_table values");
    +        session_stat_aggregation.push_back("insert into Trafodion.\"_REPOS_\".metric_query_aggr_table values");
    +
    +        query_list.push_back(session_start);
    +        query_list.push_back(statement_new_query);
    +        query_list.push_back(session_stat_aggregation);
    +
    +        int query_limit = statisticsCacheSize;
    +        int time_limit = aggrInterval;
    +        //0:None 1:update 2:insert/upsert cache limit 3:achieve timeline
    +        int execute_flag = REPOS_EXECUTE_NONE;
    +        clock_t time_start = clock();
    +        clock_t time_end= clock();
    +
    +        REPOS_STATS repos_stats;
    +        while(!record_session_done && okToGo)
    +        {
    +            time_start = clock();
    +            while(repos_queue.isEmpty() && (((time_end = clock()) - time_start) / 1000000 < time_limit));
    --- End diff --
    
    Thanks. Of course not. The existing behavior is that as long as there is data in the queue， it will be published into repository immediately. statisticsCacheSize just determines the number of data cached in the queue. if it is set to 1, the data will be sent immediately once it goes into the queue, like  'The existing behavior'. @venkat1m 


---

[GitHub] trafodion pull request #1537: [TRAFODION-3041] Support watchdog query cache

Posted by DaveBirdsall <gi...@git.apache.org>.

Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/trafodion/pull/1537#discussion_r183818934
  
    --- Diff: core/conn/odbc/src/odbc/nsksrvr/SrvrConnect.cpp ---
    @@ -530,11 +542,39 @@ static void* SessionWatchDog(void* arg)
                     okToGo = false;
                 }
             }
    -
    -
    -		while(!record_session_done && okToGo)
    -		{
    -			REPOS_STATS repos_stats = repos_queue.get_task();
    +        vector< vector<string> > query_list;
    +        vector<string> session_start;
    +        vector<string> statement_new_query;
    +        vector<string> session_stat_aggregation;
    +
    +        session_start.push_back("upsert into Trafodion.\"_REPOS_\".metric_session_table values");
    +        statement_new_query.push_back("insert into Trafodion.\"_REPOS_\".metric_query_table values");
    +        session_stat_aggregation.push_back("insert into Trafodion.\"_REPOS_\".metric_query_aggr_table values");
    +
    +        query_list.push_back(session_start);
    +        query_list.push_back(statement_new_query);
    +        query_list.push_back(session_stat_aggregation);
    +
    +        int query_limit = statisticsCacheSize;
    +        int time_limit = aggrInterval;
    +        //0:None 1:update 2:insert/upsert cache limit 3:achieve timeline
    +        int execute_flag = REPOS_EXECUTE_NONE;
    +        clock_t time_start = clock();
    +        clock_t time_end= clock();
    +
    +        REPOS_STATS repos_stats;
    +        while(!record_session_done && okToGo)
    +        {
    +            time_start = clock();
    +            while(repos_queue.isEmpty() && (((time_end = clock()) - time_start) / 1000000 < time_limit));
    --- End diff --
    
    I'm not sure I understand this logic. The "while" loop looks like it will spin until the time limit expires or repos_queue is non-empty. This raises two concerns: 1. Spin loops burn CPU needlessly. It would be better to put a sleep call inside the loop. 2. In the logic below, you check to see if the queue is non-empty and then dequeue something. This does not look thread-safe. Another thread might dequeue something after the repos_queue.isEmpty() call and before the repos_queue.get_task() call.


---