Posted to dev@hawq.apache.org by jiny2 <gi...@git.apache.org> on 2015/11/09 01:11:59 UTC

[GitHub] incubator-hawq pull request: HAWQ-136. Uneven resource allocation ...

GitHub user jiny2 opened a pull request:

    https://github.com/apache/incubator-hawq/pull/83

    HAWQ-136. Uneven resource allocation request caused queuing queries u…

    This fix makes the resource manager always acquire GRM containers (from YARN, for example) in a way that dispatches the GRM resource acquired by HAWQ evenly across the whole cluster. This may cause more resource to be acquired from the GRM (YARN etc.).
    
    If the resource manager wants more resource, it always tries to raise the minimum water level of the HAWQ-acquired resource across the whole cluster.
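
The leveling idea described above can be sketched in isolation. This is a minimal illustration, not the code from the patch: `level_up` and the plain `int` array of per-segment container counts are hypothetical names chosen for the example, and segment eligibility is ignored here.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of HAWQ): raise the lowest "water level"
 * (containers per segment) one step at a time until at least `want`
 * containers have been added to the request.
 *
 * Returns the number of containers actually added, which can be larger
 * than `want` because every segment below the target level is raised.
 */
static int level_up(int *levels, size_t nseg, int want)
{
    int added = 0;

    if (nseg == 0 || want <= 0)
        return 0;

    /* Find the current minimum water level. */
    int low = levels[0];
    for (size_t i = 1; i < nseg; i++)
    {
        if (levels[i] < low)
            low = levels[i];
    }

    while (added < want)
    {
        low++; /* target level for this pass */
        for (size_t i = 0; i < nseg; i++)
        {
            if (levels[i] < low)
            {
                added += low - levels[i];
                levels[i] = low;
            }
        }
    }

    return added;
}
```

With levels {4, 1, 1, 3} and a request for 3 containers, this sketch returns 4 and leaves the levels at {4, 3, 3, 3}: the two lowest segments are raised together, so the request grows by one container beyond what was asked for.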

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jiny2/incubator-hawq HAWQ-136

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-hawq/pull/83.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #83
    
----
commit e296d67911f38aa4f324353fdd3157352eba6e93
Author: Yi Jin <yj...@pivotal.io>
Date:   2015-11-09T00:09:43Z

    HAWQ-136. Uneven resource allocation request caused queuing queries unable to get even resource allocation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-hawq pull request: HAWQ-136. Uneven resource allocation ...

Posted by jiny2 <gi...@git.apache.org>.
Github user jiny2 commented on the pull request:

    https://github.com/apache/incubator-hawq/pull/83#issuecomment-154890864
  
    This new implementation also includes a fix: unavailable segments are not considered when building the resource request to the libyarn/none resource broker.



[GitHub] incubator-hawq pull request: HAWQ-136. Uneven resource allocation ...

Posted by jiny2 <gi...@git.apache.org>.
Github user jiny2 closed the pull request at:

    https://github.com/apache/incubator-hawq/pull/83



[GitHub] incubator-hawq pull request: HAWQ-136. Uneven resource allocation ...

Posted by zhangh43 <gi...@git.apache.org>.
Github user zhangh43 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/83#discussion_r44236761
  
    --- Diff: src/backend/resourcemanager/resourcemanager.c ---
    @@ -2177,6 +2213,149 @@ int generateAllocRequestToBroker(void)
     	return res;
     }
     
    +void completeAllocRequestToBroker(int32_t 	 *reqmem,
    +								  int32_t 	 *reqcore,
    +								  List 		**preferred)
    +{
    +	/*
    +	 * Go through each segment to get minimum water level. The idea of completing
    +	 * the request is to keep pulling up the lowest water level in the cluster
    +	 * until equal or more GRM containers are requested.
    +	 */
    +	Assert(*reqmem % *reqcore == 0);
    +	uint32_t ratio = *reqmem / *reqcore;
    +
    +	/* Step 1. Get lowest water level and build up index. */
    +	List *ressegl = NULL;
    +	getAllPAIRRefIntoList(&(PRESPOOL->Segments), &ressegl);
    +	/* Index of each segment in current preferred host list. */
    +	PAIR *reqidx = rm_palloc0(PCONTEXT, sizeof(PAIR) * list_length(ressegl));
    +	int llevel = INT_MAX;
    +	int totalcount = 0;
    +	int index = 0;
    +	ListCell *cell = NULL;
    +	foreach(cell, ressegl)
    +	{
    +		reqidx[index] = NULL;
    +
    +		PAIR pair = (PAIR)lfirst(cell);
    +		SegResource segres = (SegResource)(pair->Value);
    +
    +		/*
    +		 * Resource manager skips this segment if
    +		 * 1) Not FTS available;
    +		 * 2) Not GRM available;
    +		 * 3) Having resource decrease pending.
    +		 */
    +		if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ||
    +			(DRMGlobalInstance->ImpType != NONE_HAWQ2 &&
    +			 !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) ||
    +			(segres->DecPending.MemoryMB > 0 && segres->DecPending.Core > 0))
    +		{
    +			index++;
    +			continue;
    +		}
    +
    +		int clevel = segres->ContainerSets[0] == NULL ?
    +					 0 :
    +					 list_length(segres->ContainerSets[0]->Containers) +
    +					 segres->IncPending.MemoryMB / ratio;
    +
    +		ListCell *pcell = NULL;
    +		foreach(pcell, *preferred)
    +		{
    +			PAIR existpair = (PAIR)lfirst(pcell);
    +			if ( pair->Value == segres )
    +			{
    +				reqidx[index] = existpair;
    +				totalcount += ((ResourceBundle)(reqidx[index]->Value))->MemoryMB /
    +							  ratio;
    +				break;
    +			}
    +		}
    +
    +
    +		int creqsize = reqidx[index] == NULL ?
    +					   0 :
    +					   ((ResourceBundle)(reqidx[index]->Value))->MemoryMB / ratio;
    +
    +		llevel = clevel+creqsize < llevel ? clevel+creqsize : llevel;
    +		index++;
    +	}
    +
    +	/* Step 2. Adjust request. */
    +	int32_t reqcoreleft = *reqcore - totalcount;
    +	while( reqcoreleft > 0 )
    +	{
    +		llevel++;
    +		index = 0;
    +		foreach(cell, ressegl)
    +		{
    +			PAIR pair = (PAIR)lfirst(cell);
    +			SegResource segres = (SegResource)(pair->Value);
    +
    +			/*
    +			 * Resource manager skips this segment if
    +			 * 1) Not FTS available;
    +			 * 2) Not GRM available;
    +			 * 3) Having resource decrease pending.
    +			 */
    +			if (!IS_SEGSTAT_FTSAVAILABLE(segres->Stat) ||
    +				(DRMGlobalInstance->ImpType != NONE_HAWQ2 &&
    +				 !IS_SEGSTAT_GRMAVAILABLE(segres->Stat)) ||
    +				(segres->DecPending.MemoryMB > 0 && segres->DecPending.Core > 0))
    +			{
    +				index++;
    +				continue;
    +			}
    +
    +			int clevel = segres->ContainerSets[0] == NULL ?
    +						 0 :
    +						 list_length(segres->ContainerSets[0]->Containers) +
    +						 segres->IncPending.MemoryMB / ratio;
    +
    +			int aclevel = reqidx[index] == NULL ?
    +						  clevel :
    +						  clevel + ((ResourceBundle)(reqidx[index]->Value))->MemoryMB / ratio;
    +
    +			if ( llevel > aclevel )
    +			{
    +
    +				if ( reqidx[index] == NULL )
    +				{
    +					reqidx[index] = rm_palloc0(PCONTEXT, sizeof(PAIRData));
    +					reqidx[index]->Key = segres;
    +					ResourceBundle resource = rm_palloc0(PCONTEXT,
    +														 sizeof(ResourceBundleData));
    +					resetResourceBundleData(resource, 0, 0, ratio);
    +					reqidx[index]->Value = resource;
    +					*preferred = lappend(*preferred, reqidx[index]);
    +				}
    +				addResourceBundleData((ResourceBundle)(reqidx[index]->Value),
    +									  (llevel-aclevel) * ratio,
    +									  llevel-aclevel);
    +				reqcoreleft -= llevel-aclevel;
    +
    +				elog(RMLOG, "Resource manager acquires %lf GRM containers on "
    +						    "host %s. Current level(having pending) %d, "
    +						    "expect level %d, acquired in current request %d.",
    +						    ((ResourceBundle)(reqidx[index]->Value))->Core,
    +						    GET_SEGRESOURCE_HOSTNAME(((SegResource)(reqidx[index]->Key))),
    +							clevel,
    +							llevel,
    +							aclevel-clevel);
    +			}
    +
    +			index++;
    +		}
    +	}
    +
    +	/* Adjust total mem and core request. */
    +	*reqcore -= reqcoreleft;
    --- End diff --
    
    +1
    Just wondering: must reqcoreleft be 0 at this point, or can it be negative?
    Also, is there a circumstance in which the "resource manager skips this segment" condition (line 2303) is true for all segments, which would lead to an infinite loop?
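
One way to rule out the loop raised in the second question is to count eligible segments before leveling and return early when there are none. The sketch below is only an illustration under simplifying assumptions: `SegSketch`, `seg_is_eligible` and `complete_request` are hypothetical names, and the eligibility test is reduced to three booleans rather than the exact macros and `DecPending` checks used in the diff.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-segment state in the diff. */
typedef struct
{
    bool fts_available; /* segment is reported as available by FTS */
    bool grm_available; /* segment is available in the global resource manager */
    bool dec_pending;   /* a resource decrease is pending on this segment */
    int  level;         /* containers held plus pending increase */
} SegSketch;

static bool seg_is_eligible(const SegSketch *seg)
{
    return seg->fts_available && seg->grm_available && !seg->dec_pending;
}

/*
 * Adds containers to eligible segments by raising the lowest level.
 * Returns the number of containers added, or 0 when no segment is
 * eligible, in which case the leveling loop is never entered.
 */
static int complete_request(SegSketch *segs, size_t nseg, int want)
{
    size_t eligible = 0;
    int low = 0;

    for (size_t i = 0; i < nseg; i++)
    {
        if (!seg_is_eligible(&segs[i]))
            continue;
        if (eligible == 0 || segs[i].level < low)
            low = segs[i].level;
        eligible++;
    }

    /* Guard: with no eligible segment nothing could absorb the request. */
    if (eligible == 0 || want <= 0)
        return 0;

    int added = 0;
    while (added < want)
    {
        low++; /* at least one eligible segment is below this level */
        for (size_t i = 0; i < nseg; i++)
        {
            if (seg_is_eligible(&segs[i]) && segs[i].level < low)
            {
                added += low - segs[i].level;
                segs[i].level = low;
            }
        }
    }

    return added;
}
```

Because `low` starts at the minimum over eligible segments only, every pass raises at least one segment, so the loop makes progress whenever it is entered.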



[GitHub] incubator-hawq pull request: HAWQ-136. Uneven resource allocation ...

Posted by jiny2 <gi...@git.apache.org>.
Github user jiny2 commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq/pull/83#discussion_r44236932
  
    
    Yes, it can be negative. For example, if 1 GRM container is required but raising the minimum water level means 3 segments each need one more GRM container added to the request (3 in total), reqcoreleft ends up as -2.
    
    I will check the infinite loop problem and fix it. Thank you!
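
The arithmetic in this example can be checked with a small sketch. The helper names below are hypothetical and only model a single leveling pass: one pass raises every segment sitting at the lowest level by one container, so the remainder is the request minus the number of such segments, and the final adjustment mirrors `*reqcore -= reqcoreleft` from the diff.

```c
#include <stddef.h>

/* Hypothetical helper: how many segments sit at the lowest water level. */
static int segments_at_lowest(const int *levels, size_t nseg)
{
    if (nseg == 0)
        return 0;

    int low = levels[0];
    int count = 1;
    for (size_t i = 1; i < nseg; i++)
    {
        if (levels[i] < low)
        {
            low = levels[i];
            count = 1;
        }
        else if (levels[i] == low)
        {
            count++;
        }
    }
    return count;
}

/* Remainder of the request after one leveling pass; negative means over-request. */
static int left_after_one_pass(const int *levels, size_t nseg, int reqcore)
{
    return reqcore - segments_at_lowest(levels, nseg);
}

/* Same adjustment as `*reqcore -= reqcoreleft` in the diff. */
static int adjusted_request(int reqcore, int reqcoreleft)
{
    return reqcore - reqcoreleft;
}
```

For three segments at the same level and a request for 1 container, `left_after_one_pass` gives -2 and `adjusted_request(1, -2)` gives 3, matching the numbers in the reply.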

