
Posted to dev@hawq.apache.org by "Kuien Liu (JIRA)" <ji...@apache.org> on 2017/11/14 03:15:00 UTC

[jira] [Created] (HAWQ-1550) Query on hawq_toolkit.hawq_log_master_concise is slower as time going

Kuien Liu created HAWQ-1550:
-------------------------------

             Summary: Query on hawq_toolkit.hawq_log_master_concise is slower as time going
                 Key: HAWQ-1550
                 URL: https://issues.apache.org/jira/browse/HAWQ-1550
             Project: Apache HAWQ
          Issue Type: Improvement
            Reporter: Kuien Liu
            Assignee: Radar Lei


As time goes on, the log file size on the master grows linearly, and queries on hawq_toolkit.hawq_log_master_concise become slower and slower.

I have collected a set of performance data (on a daily-build machine) with the following SQL:

{code:sql}
select count(*) from hawq_toolkit.hawq_log_master_concise;
{code}

||log size||tuples||time||
|5.0M|17381 rows|291.866 ms|
|10.0M|32650 rows|522.552 ms|
|20.0M|5939 rows|938.230 ms|

That means:
1. If we want to monitor hawq_log_master_concise with a response time under 1 second, so that a warning can be raised on ERROR, FATAL, or PANIC, the log size must be constrained (the count(*) above already takes about 0.94 s on the 20.0M log).
2. We also need a way to focus on the latest (HOT) log rotation files.
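
For illustration, the monitoring query described in point 1 might look like the sketch below. The column names (logtime, logseverity, logmessage) are assumed from the existing hawq_toolkit log views, and the 1-minute window is a hypothetical polling interval, not something measured above.

{code:sql}
-- Sketch only: column names are assumed from the hawq_toolkit log views,
-- and the 1-minute window is an arbitrary polling interval for illustration.
select logtime, logseverity, logmessage
from hawq_toolkit.hawq_log_master_concise
where logseverity in ('ERROR', 'FATAL', 'PANIC')
  and logtime > now() - interval '1 minute'
order by logtime desc;
{code}

Note that the filter only narrows the returned rows. Assuming the view still has to read every master log file before filtering, the response time would track the total log size as in the table above, which is why constraining the log size or reading only the HOT files matters.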

Besides, over time the log files become a burden on the master node, and keeping COLD logs on cloud instances is somewhat expensive for users, who may prefer to hand them off to a public log service.

We are considering adding two GUCs (log_max_size and log_max_age) to constrain the log file size, and introducing a new view (e.g., hawq_log_master_concise_hot) over the fresh log files. What do you think?

Any comments and suggestions are welcome.


