Posted to dev@hawq.apache.org by "刘奎恩(局外)" <ku...@alibaba-inc.com> on 2017/12/01 06:52:35 UTC

Re: Default LOG level for "data locality ratio:" events contribute to Hawq Master log bloat

It makes sense to adjust its LOG level to reduce the log size on the master.
1. A huge log file affects master performance. Queries on hawq_toolkit.hawq_log_master_concise get slower as time goes on.
https://issues.apache.org/jira/projects/HAWQ/issues/HAWQ-1550
2. These events have little value when segments are separated from storage nodes. If storage nodes and computing nodes are geographically distributed, which is common on cloud services, and no distributed cache system is available, then we have little interest in keeping them in the log.

data locality ratio
---------------
Kuien Liu/奎恩
------------------------------------------------------------------
From: Kyle Roberts <kr...@pivotal.io>
Sent: Friday, December 1, 2017, 05:41
To: dev <de...@hawq.incubator.apache.org>
Subject: Default LOG level for "data locality ratio:" events contribute to Hawq Master log bloat
Issue:


The default LOG level for "data locality ratio:" events contributes to HAWQ
master log bloat.  These events should only be needed when investigating or
tuning performance.

Case Scenario:

*"data locality ratio:"* events can be reported multiple times and, depending
upon the query, many times.


Sample event:


2017-11-30 20:33:39.857251
GMT,"gpadmin","testdb",p280349,th-909919968,"[local]",,2017-11-30 20:06:35
GMT,1630750,con25419,cmd28,seg-1,,,,sx2,"LOG","00000","data locality ratio:
1.000; virtual segment number: 1; different host number: 1; virtual segment
number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (1016.000
B/1016 B/1016 B); segment size with penalty(avg/min/max): (1016.000 B/1016
B/1016 B); continuity(avg/min/max): (1.000/1.000/1.000).",,,,,"SQL
statement ""select count(*)::float4 from public.test_tbl as Ta where Ta.id
is null""","analyze test_tbl;",0,,"cdbdatalocality.c",3372,
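
To make the fields of the sample event easier to see, here is a minimal Python sketch that parses it with the standard csv module. It assumes the line breaks in the record above come from mail wrapping, so the record is rejoined onto one line, and it assumes the severity and SQLSTATE fields sit directly before the message, as they do in this one sample. The helper name describe is hypothetical.

```python
import csv

# Sample record from the message above, rejoined onto one line (assumption:
# the line breaks in the email were introduced by mail wrapping).
RECORD = (
    '2017-11-30 20:33:39.857251 GMT,"gpadmin","testdb",p280349,th-909919968,'
    '"[local]",,2017-11-30 20:06:35 GMT,1630750,con25419,cmd28,seg-1,,,,sx2,'
    '"LOG","00000","data locality ratio: 1.000; virtual segment number: 1; '
    'different host number: 1; virtual segment number per host(avg/min/max): '
    '(1/1/1); segment size(avg/min/max): (1016.000 B/1016 B/1016 B); '
    'segment size with penalty(avg/min/max): (1016.000 B/1016 B/1016 B); '
    'continuity(avg/min/max): (1.000/1.000/1.000).",,,,,'
    '"SQL statement ""select count(*)::float4 from public.test_tbl as Ta '
    'where Ta.id is null""","analyze test_tbl;",0,,"cdbdatalocality.c",3372,'
)

MARKER = "data locality ratio:"


def describe(record):
    """Split one CSV log record and pick out the fields around the message."""
    fields = next(csv.reader([record]))
    # Locate the message by its prefix instead of a hard-coded column index.
    msg_idx = next(i for i, f in enumerate(fields) if f.startswith(MARKER))
    return {
        "severity": fields[msg_idx - 2],   # observed position in this sample
        "sqlstate": fields[msg_idx - 1],   # observed position in this sample
        "message": fields[msg_idx],
        "source_file": fields[-3],
        "source_line": fields[-2],
    }


info = describe(RECORD)
print(info["severity"], info["source_file"], info["source_line"])
```

The severity comes back as LOG and the source location as cdbdatalocality.c line 3372, which is the call site discussed in this thread.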



For example, we have seen HAWQ master logs in which close to *20% of the file
size* comes from the *"data locality ratio:"* events alone.
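
A rough way to check this share on another log is sketched below in Python. It is an approximation under stated assumptions: it counts bytes per physical line, so a record whose message spans several lines would be undercounted, and the function name locality_share is hypothetical.

```python
MARKER = "data locality ratio:"


def locality_share(lines):
    """Return the fraction of bytes that belong to lines containing MARKER."""
    total = 0
    matched = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        total += size
        if MARKER in line:
            matched += size
    # An empty input has no bytes at all, so report a share of zero.
    return matched / total if total else 0.0


# Tiny in-memory demonstration: 20 matching bytes out of 100 in total.
demo_lines = ["x" * 80, MARKER]
print(f"{locality_share(demo_lines):.1%}")
```

For a real log, the same function can be fed an open file object, for example open(path, encoding="utf-8", errors="replace"), where path is whatever master log file you want to inspect.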


Product Enhancement suggestion:


Because these are only needed for performance tuning (or for investigation):


In:


https://github.com/apache/incubator-hawq/blob/master/src/backend/cdb/cdbdatalocality.c#L3372


Could we change the default level for this message from LOG to something like DEBUG1?
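
To illustrate why this would help, here is a small Python model of the severity ordering that PostgreSQL documents for log_min_messages, where LOG ranks above ERROR for the server log. That HAWQ follows the same ordering is an assumption here, and the WARNING threshold is only an example value, not a statement about the HAWQ default. The function name is_written is hypothetical.

```python
# Severity order documented by PostgreSQL for log_min_messages, from least
# to most severe. Assumption: HAWQ uses the same ordering for its server log.
SERVER_LOG_ORDER = [
    "DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
    "INFO", "NOTICE", "WARNING", "ERROR", "LOG", "FATAL", "PANIC",
]


def is_written(level, log_min_messages):
    """Return True if a message at `level` passes the given threshold."""
    return (SERVER_LOG_ORDER.index(level)
            >= SERVER_LOG_ORDER.index(log_min_messages))


# Example threshold only; check the actual setting on your cluster.
threshold = "WARNING"
for level in ("LOG", "DEBUG1"):
    print(level, "written" if is_written(level, threshold) else "suppressed")
```

Under this model a LOG message is written at every threshold up to and including LOG, while a DEBUG1 message only appears once the threshold is lowered to DEBUG1 or below, which is what someone tuning performance would do on purpose.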




- Kyle

Re: Default LOG level for "data locality ratio:" events contribute to Hawq Master log bloat

Posted by Shubham Sharma <ss...@pivotal.io>.
Congratulations on your first PR, Kyle. Welcome to the community.

On Fri, Dec 1, 2017 at 5:53 PM, Kyle Roberts <kr...@pivotal.io> wrote:

> Created JIRA: https://issues.apache.org/jira/browse/HAWQ-1560
>
> Submitted pull request: https://github.com/apache/incubator-hawq/pull/1318
>
> - Kyle

-- 
Regards,
Shubham Sharma
Staff Customer Engineer
Pivotal Global Support Services
ssharma@pivotal.io
Direct Tel: +1(510)-304-8201
Office Hours: Mon-Fri 9:00 am to 5:00 pm PDT
Out of Office Hours Contact +1 877-477-2269

Re: Default LOG level for "data locality ratio:" events contribute to Hawq Master log bloat

Posted by Kyle Roberts <kr...@pivotal.io>.
Created JIRA: https://issues.apache.org/jira/browse/HAWQ-1560

Submitted pull request: https://github.com/apache/incubator-hawq/pull/1318

- Kyle



On Fri, Dec 1, 2017 at 4:47 PM, Kyle Roberts <kr...@pivotal.io> wrote:

> Hi everyone,
>
> I will be creating a JIRA and submitting a pull request for this.
>
> Thanks,
>
> - Kyle

-- 
Kyle Roberts  |  Staff Customer Engineer  |  650-846-1667
Support.Pivotal.io  |  Mon-Fri  9:00am to 5:00pm PST  |  1-877-477-2269

Re: Default LOG level for "data locality ratio:" events contribute to Hawq Master log bloat

Posted by Kyle Roberts <kr...@pivotal.io>.
Hi everyone,

I will be creating a JIRA and submitting a pull request for this.

Thanks,

- Kyle
