Posted to common-issues@hadoop.apache.org by "T Meyarivan (JIRA)" <ji...@apache.org> on 2011/03/03 18:01:37 UTC
[jira] Updated: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()
[ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
T Meyarivan updated HADOOP-7160:
--------------------------------
Description:
{code:title=trunk/src/native/src/org/apache/hadoop/security/getGroup.c|borderStyle=solid}
int getGroupDetails(gid_t group, char **grpBuf) {
  struct group * grp = NULL;
  size_t currBufferSize = sysconf(_SC_GETGR_R_SIZE_MAX);
  if (currBufferSize < 1024) {
    currBufferSize = 1024;
  }
  *grpBuf = NULL;
  char *buf = (char*)malloc(sizeof(char) * currBufferSize);
  if (!buf) {
    return ENOMEM;
  }
  int error;
  for (;;) {
    error = getgrgid_r(group, (struct group*)buf,
                       buf + sizeof(struct group),
                       currBufferSize - sizeof(struct group), &grp);
    if(error != ERANGE) {
      break;
    }
    free(buf);
    currBufferSize *= 2;
    buf = malloc(sizeof(char) * currBufferSize);
    if(!buf) {
      return ENOMEM;
    }
...
{code}
For large groups, this implies at least 2 queries for the group. Assuming the initial buffer is 1024 bytes and doubles on each ERANGE, number of queries = 1 + math.ceil(math.log(response_size/1024, 2)) for response_size > 1024.
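To make the query count concrete, here is a minimal sketch that simulates the doubling loop without touching the group database. The function name count_lookups and its parameters are hypothetical (not from getGroup.c), and it ignores the sizeof(struct group) bytes that the real code reserves at the front of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts how many getgrgid_r() calls a doubling
 * retry loop would issue before the buffer is large enough.
 *   initial_size: starting buffer size in bytes
 *   needed_size:  assumed buffer size the group entry requires */
static unsigned count_lookups(size_t initial_size, size_t needed_size) {
    size_t size = (initial_size == 0) ? 1 : initial_size;
    unsigned calls = 1;                     /* the first attempt always runs */

    /* Each ERANGE result doubles the buffer and costs one more lookup.
     * The second condition stops before size * 2 could overflow. */
    while (size < needed_size && size <= SIZE_MAX / 2) {
        size *= 2;
        calls++;
    }
    return calls;
}
```

Under these assumptions, an entry that needs 65536 bytes costs count_lookups(1024, 65536) = 7 lookups from a 1024-byte start, but only 1 if the initial size is already 65536.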
In a large cluster with central user/group databases (exposed via LDAP etc.), this puts unnecessary load on the central services. Making the initial buffer size a configurable parameter would alleviate this to a large extent.
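One possible shape for the configurable initial size is sketched below. This is not the patch for this issue: the environment variable GROUP_LOOKUP_INITIAL_BUFSIZE and the function names initial_buffer_size and lookup_group_name are assumptions made for illustration, and Hadoop would more likely pass the value in from its own configuration. The sketch also hands getgrgid_r() a separate struct group instead of carving it out of the buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <grp.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical setting name; not part of Hadoop. */
#define INITIAL_BUF_ENV "GROUP_LOOKUP_INITIAL_BUFSIZE"

/* Starting buffer size: the override if it parses as a number between
 * 1024 and 2^30, otherwise the sysconf() hint with a floor of 1024. */
static size_t initial_buffer_size(void) {
    const char *override = getenv(INITIAL_BUF_ENV);
    if (override != NULL) {
        char *end = NULL;
        errno = 0;
        unsigned long long v = strtoull(override, &end, 10);
        if (errno == 0 && end != override && *end == '\0' &&
            v >= 1024 && v <= (1ULL << 30)) {
            return (size_t)v;
        }
    }
    long hint = sysconf(_SC_GETGR_R_SIZE_MAX);  /* may be -1 (no limit) */
    return (hint < 1024) ? (size_t)1024 : (size_t)hint;
}

/* Copies the group's name into name_out. Returns 0 on success, ENOENT
 * if the gid is unknown, ENOMEM, or the error from getgrgid_r(). */
static int lookup_group_name(gid_t gid, char *name_out, size_t name_len) {
    size_t size = initial_buffer_size();
    char *buf = malloc(size);
    if (buf == NULL) {
        return ENOMEM;
    }

    struct group grp;
    struct group *result = NULL;
    int error;
    /* Retry with a doubled buffer only while the entry does not fit. */
    while ((error = getgrgid_r(gid, &grp, buf, size, &result)) == ERANGE) {
        if (size > SIZE_MAX / 2) {
            free(buf);
            return ENOMEM;
        }
        char *bigger = realloc(buf, size * 2);
        if (bigger == NULL) {
            free(buf);
            return ENOMEM;
        }
        buf = bigger;
        size *= 2;
    }

    if (error == 0 && result == NULL) {
        error = ENOENT;                     /* no such group */
    }
    if (error == 0) {
        snprintf(name_out, name_len, "%s", result->gr_name);
    }
    free(buf);
    return error;
}
```

With the variable set to, say, 65536, any group entry that fits in 64 KiB should resolve in a single call, at the cost of a larger allocation per lookup.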
--
> Configurable initial buffersize for getGroupDetails()
> -----------------------------------------------------
>
> Key: HADOOP-7160
> URL: https://issues.apache.org/jira/browse/HADOOP-7160
> Project: Hadoop Common
> Issue Type: Improvement
> Components: native, security
> Affects Versions: 0.22.0
> Reporter: T Meyarivan
>
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira