Posted to common-issues@hadoop.apache.org by "T Meyarivan (JIRA)" <ji...@apache.org> on 2011/03/03 17:56:37 UTC

[jira] Created: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Configurable initial buffersize for getGroupDetails()
-----------------------------------------------------

                 Key: HADOOP-7160
                 URL: https://issues.apache.org/jira/browse/HADOOP-7160
             Project: Hadoop Common
          Issue Type: Improvement
          Components: native, security
    Affects Versions: 0.22.0
            Reporter: T Meyarivan


trunk/src/native/src/org/apache/hadoop/security/getGroup.c

"""
int getGroupDetails(gid_t group, char **grpBuf) {
  struct group * grp = NULL;
  size_t currBufferSize = sysconf(_SC_GETGR_R_SIZE_MAX);
  if (currBufferSize < 1024) {
    currBufferSize = 1024;
  }
  *grpBuf = NULL; 
  char *buf = (char*)malloc(sizeof(char) * currBufferSize);

  if (!buf) {
    return ENOMEM;
  }
  int error;
  for (;;) {
    error = getgrgid_r(group, (struct group*)buf,
                       buf + sizeof(struct group),
                       currBufferSize - sizeof(struct group), &grp);
    if(error != ERANGE) {
       break;
    }
    free(buf);
    currBufferSize *= 2;
    buf = malloc(sizeof(char) * currBufferSize);
    if(!buf) {
      return ENOMEM;
    }
...
"""

For large groups, this implies at least 2 queries for the group. With a 1024-byte initial buffer that doubles on every ERANGE, and response_size > 1024, number of queries = 1 + math.ceil(math.log(response_size/1024, 2)).

In a large cluster with central user/group databases (exposed via LDAP etc.), this leads to unnecessary load on the central services. This can be alleviated to a large extent by making the initial buffer size a configurable parameter.
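
To make the retry cost concrete, here is a minimal sketch that counts how many getgrgid_r() calls the doubling loop would issue for a given starting size. It is pure arithmetic and performs no lookups; count_lookups, initial_size and needed_size are hypothetical names, and needed_size is an assumed figure rather than something measured from a real directory.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Counts the getgrgid_r() calls made by a loop that starts with
 * initial_size bytes and doubles the buffer on every ERANGE.
 * initial_size: starting buffer size in bytes (hypothetical tunable).
 * needed_size:  bytes the group entry needs (assumed, not measured).
 * Returns 0 for an invalid (zero) initial size.
 */
unsigned count_lookups(size_t initial_size, size_t needed_size) {
  if (initial_size == 0) {
    return 0;
  }
  unsigned calls = 1;            /* the first attempt always happens */
  size_t size = initial_size;
  while (size < needed_size) {   /* this attempt would fail with ERANGE */
    if (size > SIZE_MAX / 2) {
      break;                     /* cannot double any further */
    }
    size *= 2;
    calls++;
  }
  return calls;
}
```

Under these assumptions, an entry needing 40000 bytes costs 7 calls from a 1024-byte start and a single call from a 65536-byte start.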

--

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "T Meyarivan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002177#comment-13002177 ] 

T Meyarivan commented on HADOOP-7160:
-------------------------------------

nscd doesn't cache the result until the query succeeds, so a lookup that needs N attempts takes N queries (the result is discarded N-1 times).

A cold cache plus a large job is likely to trigger a flood of queries.

--

[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002077#comment-13002077 ] 

Allen Wittenauer commented on HADOOP-7160:
------------------------------------------

What happens if you increase the nscd buffer size?

[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002186#comment-13002186 ] 

Allen Wittenauer commented on HADOOP-7160:
------------------------------------------

Even with tuning of negative-time-to-live, positive-time-to-live, etc?

[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002257#comment-13002257 ] 

Todd Lipcon commented on HADOOP-7160:
-------------------------------------

So sysconf(_SC_GETGR_R_SIZE_MAX) is returning a too-small value as well, I guess?

Rather than making this configurable, it seems we could bump it to something like 64KB - I can't imagine the extra memory usage would harm anyone. How big a buffer do you need for your groups?
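
A minimal sketch of the fixed-floor idea, assuming the 64KB figure floated above. clamp_initial_size, initial_group_buffer_size and GROUP_BUF_FLOOR are hypothetical names, not part of getGroup.c. The sketch also checks for a negative sysconf() result: sysconf() can return -1, and storing that in a size_t unchecked, as the quoted code appears to do, would yield a huge value rather than a small one.

```c
#include <stddef.h>
#include <unistd.h>

#define GROUP_BUF_FLOOR ((size_t)64 * 1024)  /* 64KB, as suggested in the thread */

/*
 * Turns a raw sysconf() result into a starting buffer size.
 * A negative result (limit indeterminate or error) and any value
 * below floor_size both fall back to floor_size.
 */
size_t clamp_initial_size(long sysconf_result, size_t floor_size) {
  if (sysconf_result < 0 || (size_t)sysconf_result < floor_size) {
    return floor_size;
  }
  return (size_t)sysconf_result;
}

/* Starting size for the getgrgid_r() buffer on this system. */
size_t initial_group_buffer_size(void) {
  return clamp_initial_size(sysconf(_SC_GETGR_R_SIZE_MAX), GROUP_BUF_FLOOR);
}
```

The doubling loop would still be needed as a fallback for entries larger than the floor.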

[jira] Updated: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "T Meyarivan (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Meyarivan updated HADOOP-7160:
--------------------------------

    Description: 
{code:title=trunk/src/native/src/org/apache/hadoop/security/getGroup.c|borderStyle=solid}
int getGroupDetails(gid_t group, char **grpBuf) {
  struct group * grp = NULL;
  size_t currBufferSize = sysconf(_SC_GETGR_R_SIZE_MAX);
  if (currBufferSize < 1024) {
    currBufferSize = 1024;
  }
  *grpBuf = NULL; 
  char *buf = (char*)malloc(sizeof(char) * currBufferSize);

  if (!buf) {
    return ENOMEM;
  }
  int error;
  for (;;) {
    error = getgrgid_r(group, (struct group*)buf,
                       buf + sizeof(struct group),
                       currBufferSize - sizeof(struct group), &grp);
    if(error != ERANGE) {
       break;
    }
    free(buf);
    currBufferSize *= 2;
    buf = malloc(sizeof(char) * currBufferSize);
    if(!buf) {
      return ENOMEM;
    }
...
{code}

For large groups, this implies at least 2 queries for the group. With a 1024-byte initial buffer that doubles on every ERANGE, and response_size > 1024, number of queries = 1 + math.ceil(math.log(response_size/1024, 2)).

In a large cluster with central user/group databases (exposed via LDAP etc.), this leads to unnecessary load on the central services. This can be alleviated to a large extent by making the initial buffer size a configurable parameter.

--


[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "T Meyarivan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002085#comment-13002085 ] 

T Meyarivan commented on HADOOP-7160:
-------------------------------------

If the query succeeds, nscd caches the result for time X. Once it is cached, there is little or no extra load on the remote servers until the entry expires.

--

[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "Joep Rottinghuis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002724#comment-13002724 ] 

Joep Rottinghuis commented on HADOOP-7160:
------------------------------------------

Avoiding multiple round-trips seems like a good idea.

If you make it configurable, it would be good to have a way to tell how large the value should be.
The same mechanism can be used to answer Todd's question.
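
One possible shape for such a mechanism, sketched under assumptions: a small probe that runs the same doubling search for a given gid and reports the buffer size that finally worked. probe_group_buffer_size is a hypothetical helper, not something in Hadoop. The reported size is an upper bound (the starting size times a power of two), not the exact response size, and each probe issues real lookups, so it would be something to run occasionally rather than per request.

```c
#include <errno.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Doubles the buffer from start_size until getgrgid_r() stops
 * returning ERANGE. On success returns 0 and stores the size that
 * worked in *needed_out. Returns ENOENT when the lookup succeeds but
 * no such group exists, ENOMEM when allocation fails or the size
 * cannot grow, or the error reported by getgrgid_r().
 */
int probe_group_buffer_size(gid_t gid, size_t start_size, size_t *needed_out) {
  size_t size = start_size ? start_size : 1024;
  for (;;) {
    char *buf = malloc(size);
    if (!buf) {
      return ENOMEM;
    }
    struct group grp;
    struct group *result = NULL;
    int error = getgrgid_r(gid, &grp, buf, size, &result);
    free(buf);                       /* only the size matters here */
    if (error == ERANGE) {
      if (size > (size_t)-1 / 2) {
        return ENOMEM;               /* cannot double any further */
      }
      size *= 2;
      continue;
    }
    if (error != 0) {
      return error;
    }
    if (result == NULL) {
      return ENOENT;                 /* no entry for this gid */
    }
    *needed_out = size;
    return 0;
  }
}
```

Running this against the largest groups on a cluster would give a starting point for the configured value, and would also show how big the buffers need to be in practice.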


[jira] Commented: (HADOOP-7160) Configurable initial buffersize for getGroupDetails()

Posted by "T Meyarivan (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-7160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002828#comment-13002828 ] 

T Meyarivan commented on HADOOP-7160:
-------------------------------------

tlipcon@:

Not sure about fixed limits. If it is important to avoid another config parameter, how about adjusting the initial buffer size based on the response size?
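
A rough sketch of what adapting to the response size could look like: remember the largest buffer size that has worked so far and start the next lookup from there. All names here (next_initial_size, record_successful_size, MAX_REMEMBERED) are hypothetical, the 1MB cap is an arbitrary assumption, and the static variable is not thread-safe as written.

```c
#include <stddef.h>

#define MIN_GROUP_BUF ((size_t)1024)            /* floor used by the quoted code */
#define MAX_REMEMBERED ((size_t)1024 * 1024)    /* arbitrary cap (assumption) */

/* Largest buffer size that has worked so far; needs locking if shared. */
static size_t remembered_size = MIN_GROUP_BUF;

/* Starting buffer size for the next getgrgid_r() attempt. */
size_t next_initial_size(void) {
  return remembered_size;
}

/* Call after a successful getgrgid_r() with the buffer size that worked. */
void record_successful_size(size_t size) {
  if (size > MAX_REMEMBERED) {
    size = MAX_REMEMBERED;         /* one huge group should not pin memory */
  }
  if (size > remembered_size) {
    remembered_size = size;        /* only ever grows in this sketch */
  }
}
```

With this approach only the first lookup of a large group in each process pays for the retries, which does not help the cold-start case across many fresh processes.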

--

