Posted to issues@hawq.apache.org by "Shubham Sharma (JIRA)" <ji...@apache.org> on 2017/04/14 17:57:41 UTC

[jira] [Comment Edited] (HAWQ-1433) ALTER RESOURCE QUEUE DDL does not check the format of attribute MEMORY_CLUSTER_LIMIT

    [ https://issues.apache.org/jira/browse/HAWQ-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969208#comment-15969208 ] 

Shubham Sharma edited comment on HAWQ-1433 at 4/14/17 5:56 PM:
---------------------------------------------------------------

I did a bit of research on this, and I think the problem is in resqueuemanager.c, in updateResourceQueueAttributesInShadow(). In the RSQ_TBL_ATTR_CORE_LIMIT_CLUSTER case, the function first checks whether the string is a percentage value (SimpleStringIsPercentage). That check fails for corelimit=90, so the else branch runs and asserts false, but the variable res keeps its previous value (FUNC_RETURN_OK). Ideally the else block should bail out, probably by setting something like res=FUNC_RETURN_FAIL. Because res is still FUNC_RETURN_OK, all validations further down the stack succeed.

{code}
case RSQ_TBL_ATTR_CORE_LIMIT_CLUSTER:
			if ( SimpleStringIsPercentage(attrvalue) )
			{
				percentage_change += 1;
				int8_t inputval = 0;
				res = SimpleStringToPercentage(attrvalue, &inputval);
				shadowqueinfo->ClusterVCorePer = inputval;

				if ( res == FUNC_RETURN_OK )
				{
					elog(RMLOG, "resource manager updated %s %lf.0%% in shadow "
								"of resource queue \'%s\'",
								RSQTBLAttrNames[RSQ_TBL_ATTR_CORE_LIMIT_CLUSTER],
								shadowqueinfo->ClusterVCorePer,
								queue->QueueInfo->Name);
				}
				shadowqueinfo->Status |= RESOURCE_QUEUE_STATUS_EXPRESS_PERCENT;
			}
			else
			{
				Assert(false);
			}
			break;
{code} 
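
To make the suggestion concrete, here is a minimal, self-contained sketch of the else-branch change. It is not the HAWQ source: is_percentage() and to_percentage() are simplified, hypothetical stand-ins for SimpleStringIsPercentage() and SimpleStringToPercentage(), the numeric values of FUNC_RETURN_OK and FUNC_RETURN_FAIL are assumed, and check_core_limit_cluster() is an illustrative name that only mirrors the shape of the case above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values for illustration only; the real HAWQ codes may differ. */
#define FUNC_RETURN_OK   0
#define FUNC_RETURN_FAIL 1

/* Hypothetical stand-in for SimpleStringIsPercentage(): true for "<digits>%". */
static bool is_percentage(const char *s)
{
    size_t len = strlen(s);
    if (len < 2 || s[len - 1] != '%')
        return false;
    for (size_t i = 0; i + 1 < len; i++)
        if (s[i] < '0' || s[i] > '9')
            return false;
    return true;
}

/* Hypothetical stand-in for SimpleStringToPercentage(): accepts 0..100. */
static int to_percentage(const char *s, int8_t *out)
{
    long v = strtol(s, NULL, 10);   /* parsing stops at the trailing '%' */
    if (v < 0 || v > 100)
        return FUNC_RETURN_FAIL;
    *out = (int8_t) v;
    return FUNC_RETURN_OK;
}

/* Mirrors the shape of the RSQ_TBL_ATTR_CORE_LIMIT_CLUSTER case. */
int check_core_limit_cluster(const char *attrvalue, int8_t *percent)
{
    int res = FUNC_RETURN_OK;

    if (is_percentage(attrvalue))
    {
        int8_t inputval = 0;
        res = to_percentage(attrvalue, &inputval);
        if (res == FUNC_RETURN_OK)
            *percent = inputval;    /* store only a successfully parsed value */
    }
    else
    {
        /* Suggested change: report failure instead of only asserting, */
        /* so the caller no longer sees FUNC_RETURN_OK for input like "90". */
        res = FUNC_RETURN_FAIL;
    }
    return res;
}
```

With this shape, a value such as "50%" is accepted and stored, while "90" is rejected before it can reach pg_resqueue.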


was (Author: outofmemory):
I did a bit of research on this, and I think the problem is in resqueuemanager.c, in updateResourceQueueAttributesInShadow(). In the RSQ_TBL_ATTR_CORE_LIMIT_CLUSTER case, the function first checks whether the string is a percentage value (SimpleStringIsPercentage). That check fails for corelimit=90, so the else branch runs and asserts false, but the variable res keeps its previous value (FUNC_RETURN_OK). Ideally it should bail out, probably with something like res=FUNC_RETURN_FAIL. Because res is still FUNC_RETURN_OK, all validations further down the stack succeed.


> ALTER RESOURCE QUEUE DDL does not check the format of attribute MEMORY_CLUSTER_LIMIT
> ------------------------------------------------------------------------------------
>
>                 Key: HAWQ-1433
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1433
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Resource Manager
>            Reporter: Yi Jin
>            Assignee: Yi Jin
>             Fix For: 2.3.0.0-incubating
>
>
> Shubham Sharma <to...@gmail.com>
> 2:11 PM (2 hours ago)
> to user, sebastiao.gone. 
> Hello Sebastio, I think you have encountered the following issue - 
> 1 - Problem -  alter resource queue pg_default with (CORE_LIMIT_CLUSTER/MEMORY_LIMIT_CLUSTER=90);
> gpadmin=# select * from pg_resqueue;
>   rsqname   | parentoid | activestats | memorylimit | corelimit | resovercommit | allocpolicy | vsegresourcequota | nvsegupperlimit | nvseglowerlimit | nvsegupperlimitperseg | nvseglowerlimitperseg | creationtime |          updatetime           | status 
> ------------+-----------+-------------+-------------+-----------+---------------+-------------+-------------------+-----------------+-----------------+-----------------------+-----------------------+--------------+-------------------------------+--------
>  pg_root    |         0 |          -1 | 100%        | 100%      |             2 | even        |                   |               0 |               0 |                     0 |                     0 |              |                               | branch
>  pg_default |      9800 |          20 | 50%         | 50%       |             2 | even        | mem:256mb         |               0 |               0 |                     0 |                     0 |              | 2017-04-12 22:45:55.056102+01 | 
> (2 rows)
> gpadmin=# alter resource queue pg_default with (CORE_LIMIT_CLUSTER=90);
> ALTER QUEUE
> gpadmin=# select * from test;
>  a 
> ---
> (0 rows)
> gpadmin=# \q
> 2 - restart hawq cluster
> 3 - ERROR
> [gpadmin@hdp3 ~]$ psql
> psql (8.2.15)
> Type "help" for help.
> gpadmin=# select * from test;
> WARNING:  FD 31 having errors raised. errno 104
> ERROR:  failed to register in resource manager, failed to receive content (pquery.c:787)
> 3 - alter resource queue pg_default with (CORE_LIMIT_CLUSTER/MEMORY_LIMIT_CLUSTER=50%); --Let's switch back
> ! Not allowed !
> alter resource queue pg_default with (CORE_LIMIT_CLUSTER=50%);
> WARNING:  FD 33 having errors raised. errno 104
> ERROR:  failed to register in resource manager, failed to receive content (resqueuecommand.c:364)
> 4 -  How to fix - Please be extra careful while using this.
> gpadmin=# begin;
> BEGIN
> gpadmin=# set allow_system_table_mods='dml';
> SET
> gpadmin=# select * from pg_resqueue where corelimit=90;
>   rsqname   | parentoid | activestats | memorylimit | corelimit | resovercommit | allocpolicy | vsegresourcequota | nvsegupperlimit | nvseglowerlimit | nvsegupperlimitperseg | nvseglowerlimitperseg | creationtime |          updatetime           | status 
> ------------+-----------+-------------+-------------+-----------+---------------+-------------+-------------------+-----------------+-----------------+-----------------------+-----------------------+--------------+-------------------------------+--------
>  pg_default |      9800 |          20 | 50%         | 90        |             2 | even        | mem:256mb         |               0 |               0 |                     0 |                     0 |              | 2017-04-12 22:59:30.092823+01 | 
> (1 row)
> gpadmin=# update pg_resqueue set corelimit='50%' where corelimit=90;
> UPDATE 1
> gpadmin=# commit;
> COMMIT
> 5 - System should be back to normal
> gpadmin=# select * from test;
>  a 
> ---
> (0 rows)
> Regards,
> Shubh


