You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Aaron T. Myers (JIRA)" <ji...@apache.org> on 2011/03/30 18:20:05 UTC

[jira] [Created] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Hadoop /usr/bin/groups equivalent
---------------------------------

                 Key: HADOOP-7214
                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
             Project: Hadoop Common
          Issue Type: New Feature
    Affects Versions: 0.23.0
            Reporter: Aaron T. Myers
            Assignee: Aaron T. Myers


Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.4.txt

D'oh! Forgot to use --no-prefix when generating the patch.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023062#comment-13023062 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

Hi Todd, here are two more:

*Concern 4*: Should it belong to the {{FileSystem}}/{{FileContext}} API?

*Concern 5*: Should this service run separately, i.e. outside NameNode?


> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020034#comment-13020034 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. Here is more details: NN and JT resolve groups by invoking the configured GroupMappingServiceProvider, which might connect to another group server (e.g. LDAP). Could the shell command use the same configuration to obtain GroupMappingServiceProvider so that it could talk to the group server directly?

That at least won't work in the case of the default configuration, i.e. using {{ShellBasedUnixGroupsMapping}}. In this case the groups a user belongs to are determined simply by shelling out to `{{id -Gn <username>}}'. If that were run on the client machine, it might get the wrong groups for the user.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016632#comment-13016632 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Allen, I've previously seen you describe Hadoop as "essentially a distributed, networked operating system." I agree with that assessment. Presently, this OS provides {{chown}}, {{chgrp}}, and {{chmod}}, all of which presuppose the existence of groups associated with a user. However, this OS doesn't provide a way for a user to find out what groups they're a member of from the OS's perspective. I'm surprised you're resistant to adding this functionality. It seems to me to be a simple deficiency.

Detailed responses to your concerns are inline.

bq. My conclusion is simple: if there is an easy fix for this, go for it. But if we're inventing a bunch of tool-age to support brokenness, I think it is bad to add the weight long term (yes, this includes RESTs, RPCs, etc, etc).

You still haven't answered my question: which setup that I described above do you consider "broken" ?

As I said previously, I can probably agree that having different users/groups on the NN vs the JT is indeed a misconfiguration, and we shouldn't concern ourselves with that scenario. But, do you also consider having different users/groups on the client machine vs the NN to be a misconfiguration? That seems like a perfectly reasonable setup to me, and one that we should support.

bq. BTW, it is also perfectly reasonable to expect that companies that decide to have split naming services to provide ways to query that information on their own.

Perhaps, but Hadoop also supports making the user -> group mapping service pluggable via the {{hadoop.security.group.mapping}} configuration parameter. Why should we require implementers of this to provide a way of querying this information on their own, through some other mechanism, rather than have Hadoop show it? When a Hadoop user gets a "permission denied" error from a Hadoop command, and wants to know what groups Hadoop thinks they belong to, they'll have to run "{{random-command-x}}" rather than something simple like "{{hadoop fs -groups}}". That only seems to make Hadoop harder to use.

bq. There could be some potential security issues that we might be circumventing by providing that information out-of-band.

Hadoop assumes that file system implementations are capable of associating files and directories with users and groups, as HDFS does. That's already part of the existing Hadoop commands. A user could presently determine what groups they're a member of by creating a file and then trying to {{chgrp}} it to different things. The set of inputs for which the {{chgrp}} succeeded would be the set of groups the user is a member of. Obviously, this isn't feasible for a normal user to do when they get a {{PermissionDeniedException}}, but it's perfectly reasonable for an attacker to do.

My point is just that Hadoop isn't hiding this information as it stands. Hadoop makes decisions based on the groups a user belongs to, so we should make it easy for our users to find out what groups Hadoop thinks they belong to.

bq. The other thing to keep in mind that going down the path of 'hadoop groups' is too limiting. If we are going to provide group information, why not also provide uid, username, etc.

Showing the username seems reasonable to me, and in fact the patch I'm working on displays this. Hadoop doesn't make decisions based on one's UID, so why should we show that?

bq. In the case of a kerberized environment, there is no guarantee that the TGT info matches what is actually executed on the compute nodes due to remapping...

I don't follow this reasoning. Kerberos doesn't have any notion of groups. But, the first component of the Kerberos principal name is used as the username when the NN and JT determine a user's groups. I don't see how we need to account for anything differently with or without Kerberos support enabled.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Daryn Sharp (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016860#comment-13016860 ] 

Daryn Sharp commented on HADOOP-7214:
-------------------------------------

bq. ...except Hadoop is told what groups a user belongs to by an external source. Why shouldn't it be the responsibility of the external source to share this information? We're the consumer, not the provider when it comes to naming services.

Leaving it up to the client to implement groups creates a non-portable and brittle/maintenance-prone solution.  Ex. must know the source(s) of authority, how to interact with the authorities, dealing with possible ACL issues, knowing any additional constraints, keeping pace with changes to the environment, etc.

I found the following insightful comment in the security code.

{noformat}
A user-to-groups mapping service.
*
{@link Groups} allows for server to get the various group memberships
of a given user via the {@link #getGroups(String)} call, thus ensuring
a consistent user-to-groups mapping and protects against vagaries of
different mappings on servers and clients in a Hadoop cluster.
{noformat}

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020079#comment-13020079 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. Yes, it makes sense to let user to determine their group membership. In case of LDAP, clients should talk to the LDAP server directly. In case of shell, it may be better to run a separated group server/process for the clients outside the domain.

Such an architecture only serves to make the system more brittle. Why should a user or client program be concerned with *how* the NN is configured to perform its user/group mapping? Why is it reasonable to assume that the LDAP server configured to provide user -> group mapping on the NN is accessible or configured from the client machine?

bq. NN is already a bottleneck of the system. We don't want to overload it with other functionality.

I doubt seriously this new functionality will have hardly any detectable performance impact on the NN. The NN already performs this user -> group mapping literally every time a file or directory access is performed. It caches the results of these aggressively. A few users occasionally running this command should only negligibly increase load on the NN.

bq. I think we need some helps here. Could any security expert comment on this?

What security concerns do you still have regarding this patch? These concerns seem mostly focused around the implementation, not on the security semantics.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021473#comment-13021473 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12476585/hadoop-7214.5.txt
  against trunk revision 1094750.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/354//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/354//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/354//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023058#comment-13023058 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12477060/hadoop-7214.6.txt
  against trunk revision 1095761.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/372//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/372//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/372//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.1.txt

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031493#comment-13031493 ] 

Todd Lipcon commented on HADOOP-7214:
-------------------------------------

Looking good. I think we need to update src/docs/src/documentation/content/xdocs/service_level_auth.xml and also conf/hadoop-policy.xml.template. But let's open another JIRA, since that file is out-of-date for other ACLs as well.

+1 for this patch.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029125#comment-13029125 ] 

Todd Lipcon commented on HADOOP-7214:
-------------------------------------

- What's the point of the "helper method to get the current user" since it's just wrapping another method directly? Seems silly.
- We need to make sure we commit this at the same time as we commit the MR and HDFS changes, right? (ie this will break the build if we don't?)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017025#comment-13017025 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

{quote}
Thanks Aaron. According to the current convention, I think the usage string probably should be

{code}
GROUPS_USAGE="-groups [<username...>]";
{code}
{quote}

Thanks for the review, Nicholas. I struggled with what format to use here. The existing conventions for the {{chmod}}, {{chown}}, and {{chgrp}} Hadoop commands would seem to agree with what I presently have in the patch, as would the man page for the Unix groups command:

{noformat}
SYNOPSIS
       groups [OPTION]... [USERNAME]...
{noformat}

That said, I don't feel strongly about this. Just pointing out that the correct solution is non-obvious. If you think the format your proposed is correct, I'm happy to roll with that.

Please let me know what to do.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031645#comment-13031645 ] 

Hudson commented on HADOOP-7214:
--------------------------------

Integrated in Hadoop-Common-trunk #685 (See [https://builds.apache.org/hudson/job/Hadoop-Common-trunk/685/])
    HADOOP-7214. Add Common functionality necessary to provide an equivalent of /usr/bin/groups for Hadoop. Contributed by Aaron T. Myers.


> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017288#comment-13017288 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Hey Daryn,

bq. Make sure it outputs a posix-like error of "groups: foo: no such user".

I actually ended up not doing this, since in Hadoop it's actually possible for a user to legitimately have no groups at all (for example, with an S3 file system) and because the {{ShellBasedUnixGroupsMapping}} class doesn't actually bubble up the "no such user" case - it just returns an empty list.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017760#comment-13017760 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12475865/hadoop-7214.4.txt
  against trunk revision 1090485.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/339//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/339//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/339//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Status: Patch Available  (was: Open)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014888#comment-13014888 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

I did a little digging into this and the ideal implementation is not as obvious to me as I anticipated. Here's a few ideas:

# Given that both the NN and the JT do user -> group resolution, it'd be nice to implement this in a way common to both of them. This suggests creating a new protocol common to both just for this purpose, and adding a new option to {{`hadoop'}} along the lines of {{`hadoop groups <host:port> [username...]'}}.
\\
# The hadoop filesystem client {{i.e. `hadoop fs'}} already has commands which assume the existence of groups (e.g. `hadoop fs -chgrp'.) It therefore seems natural to implement this as a new method in FileSystem and to add a new option to {{`hadoop fs'}} along the lines of {{`hadoop fs -groups <host:port> [username...]'}}. This also seems natural since most file system implementations already deal with groups and have an obvious way of performing user -> group mapping.
\\
# We could implement this as just a servlet, along the lines of {{`hadoop daemonlog'}}.

Option #1 seems like it might be a little too heavyweight for this purpose. The trouble with option #2 is that there's presently no M/R analog to {{`hadoop fs'}} like, for example, {{`hadoop mr'}}. I don't actually think option #3 is a good idea at all, but I listed it here for completeness.

Any thoughts? I'm certainly open to any options not listed here.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Status: Patch Available  (was: Open)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019667#comment-13019667 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

> Any thoughts? I'm certainly open to any options not listed here.

Hi Aaron, 

You are right that NN and JT are doing group resolution but they are not the group service provider.   So, it may be better to let the command to directly invoke service provider (e.g. LDAP, Unix shell, etc.)  What do you think?

Sorry for this late suggestion.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017221#comment-13017221 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

bq. Why? On every system I've ever used this hasn't been the case, so I don't see why it should be the case here.

Because that isn't true on every system that Hadoop may consume.  I suspect that we actually have a higher level vulnerability here if the method isn't protected either.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017733#comment-13017733 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12475863/hadoop-7214.3.txt
  against trunk revision 1090039.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/338//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017799#comment-13017799 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

bq. Can you provide an example of such a system? I can say with a high degree of confidence that the authorization semantics of the current patch are the same as that for the vast majority of the systems that Hadoop is currently deployed on.

In particular, I'm thinking of MS PACs, which I don't think are exposed in the same way UNIX groups are. I can easily see someone implementing a user-to-group plug-in to use them, given the staggering amount of Active Directory deployments.

bq. My inclination is to implement it like this, and if a user of this hypothetical system which you refer to wants it at some point in the future, we can make it a configuration option at the NN. How does that sound?

I think we likely have an API issue.  When that gets fixed, this should fall in line with it.  I'm going to have an off-JIRA discussion with some security folks to see if this is truly something to worry about.


> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017021#comment-13017021 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks a lot for the review, Daryn. I'll upload a patch shortly addressing your comments.

bq. There's no need to catch an illegal arg exception. The caller will catch the exception and print the usage.

Good catch. Will fix.

bq. Should print out the line for the user immediately, instead of batching up all the output for all users and printing it all at the same time. An exception for one user would prevent any output.

Good point. Will fix.

bq. However, the unix "groups" command only takes one argument so we may want to conform to that.

At least on my system (Ubuntu Maverick) this isn't the case:

{code}
$ /usr/bin/groups atm hdfs mapred
atm : atm adm dialout cdrom plugdev lpadmin admin sambashare
hdfs : hdfs hadoop
mapred : mapred hadoop
{code}

bq. Make sure it outputs a posix-like error of "groups: foo: no such user".

Will do.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020135#comment-13020135 ] 

Sanjay Radia commented on HADOOP-7214:
--------------------------------------

Thinking a little bit further, one could argue that since Hadoop derives user accounts and membership from 
its environment, one should ask the environment about group membership:  that is,  use the groups command of 
your environment (typically a unix system).
Perhaps the main motivation for Hadoop-groups-command is to detect if Hadoop has been correctly configured 
for the environment. 

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029197#comment-13029197 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478254/hadoop-7214.10.patch
  against trunk revision 1099633.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/404//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/404//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/404//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029476#comment-13029476 ] 

Todd Lipcon commented on HADOOP-7214:
-------------------------------------

Looks like the JavaDoc still refers to RefreshUserMappingsProtocol where it should refer to GetUserMappingsProtocol.


> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020057#comment-13020057 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

- In {{ShellBasedUnixGroupsMapping}}, is it valid to assume that they are in the same group administrative domain?

- How about introduce a separated group proxy server for the clients outside the domain?

It seems not right to use NN as a proxy service for group resolution.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020032#comment-13020032 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

Here is more details: NN and JT resolve groups by invoking the configured {{GroupMappingServiceProvider}}, which might connect to another group server (e.g. LDAP).  Could the shell command use the same configuration to obtain {{GroupMappingServiceProvider}} so that it could talk to the group server directly?

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.0.txt

Patch creating a "hadoop fs -groups" command.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.6.txt

Rebase patch against trunk. Moves the tests for -help into common from HDFS per HADOOP-7230.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017700#comment-13017700 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12475784/hadoop-7214.2.txt
  against trunk revision 1090039.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these core unit tests:
                  org.apache.hadoop.fs.TestFilterFileSystem

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/337//testReport/
Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/337//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/337//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.1.txt

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020062#comment-13020062 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. In ShellBasedUnixGroupsMapping, is it valid to assume that they are in the same group administrative domain?

Not necessarily, but as I've mentioned previously, Hadoop already exposes this information, just not easily. Allen pointed out that it's available as part of the job conf. I also described a way a user could deduce what groups they belong to by repeatedly calling {{chgrp}}.

My point is just that Hadoop isn't hiding this information as it stands. Hadoop makes decisions based on the groups a user belongs to, so we should make it easy for our users to find out what groups Hadoop thinks they belong to.

bq. How about introduce a separated group proxy server for the clients outside the domain?

I don't fully understand the suggestion. Could you please elaborate? Is the point that you agree that users should be able to determine their group membership, but we should use some other process that isn't the NN to do it?

bq. It seems not right to use NN as a proxy service for group resolution.

I disagree. When a user interacts with HDFS, the only thing that matters with respect to groups is what groups the NN thinks they belong to. Thus, it seems perfectly natural to me to provide a way to ask the NN "what groups do you think I belong to?" Without this, it is very difficult for a user to deduce why they were denied access to a particular file/directory.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.8.patch

Patch which adds a missing license header and fixes the single javac warning.

The tests for this feature are in MR and HDFS.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014921#comment-13014921 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

The root of the problem is an inconsistent namespace.  Trying to work around that is always fraught with peril... which is why a lot of systems assume that it is flat/consistent across all nodes in a given network.

But one thing comes to mind:  isn't this information already reported as part of the job conf?  Shouldn't a user be able to query their job's information (at least via the web UI... probably should have something shell-ish) to determine this information?  Granted, it would be nice to know prior to job submission, but we're fundamentally looking at a debugging tool for broken environments.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019963#comment-13019963 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks for the review, Nicholas.

bq. You are right that NN and JT are doing group resolution but they are not the group service provider. So, it may be better to let the command to directly invoke service provider (e.g. LDAP, Unix shell, etc.) What do you think?

I'm not sure I understand the suggestion. Is your point that the FsShell should just directly invoke the configured {{GroupMappingServiceProvider}}, instead of making an RPC? The reason I went the RPC route is precisely to account for the case that there are different sets of users/groups on the client vs. the NN or JT.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016665#comment-13016665 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks for the review, Nicholas. Updated patch attached.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029035#comment-13029035 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478225/hadoop-7214.7.patch
  against trunk revision 1099633.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    -1 javac.  The applied patch generated 1071 javac compiler warnings (more than the trunk's current 1070 warnings).

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 2 release audit warnings (more than the trunk's current 1 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/400//testReport/
Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/400//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/400//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/400//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.9.patch

Updated patch addressing Todd's comments.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Daryn Sharp (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016870#comment-13016870 ] 

Daryn Sharp commented on HADOOP-7214:
-------------------------------------

Regarding the implementation:

There's no need to catch an illegal arg exception.  The caller will catch the exception and print the usage.

Should print out the line for the user immediately, instead of batching up all the output for all users and printing it all at the same time.  An exception for one user would prevent any output.  However, the unix "groups" command only takes one argument so we may want to conform to that.

Make sure it outputs a posix-like error of "groups: foo: no such user".

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017028#comment-13017028 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

Hey Aaron, you are right that we actually have two conventions.  I missed it.  The one you took, which is closer to Unix, seems better.  Thanks for pointing it out.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.2.txt

Updated patch to address Daryn's comments and rebased against trunk.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019620#comment-13019620 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks a lot for the thoughtful comment, Allen.

bq. Should this option actually be -id instead of -groups? The reason I ask is because groups(1) is a BSD command and as folks have discovered fairly non-standardized in a variety of ways. id(1)'s params have a POSIX definition. It also allows us to add user/uid mapping capability (later?) in a POSIX-compatible way.

I considered this as well, but concluded that it should be "-groups" for the following reasons:

# Given that Hadoop doesn't presently have any notion of uids/gids, many of id(1)'s options don't make any sense.
# Given that Hadoop doesn't presently have any notion of uids/gids, Hadoop can't really mimic the output of id(1).
# There's no reason that, if Hadoop were to one day add uids/gids, both options couldn't coexist, as both id(1) and groups(1) do on most OSes.
# I _think_ most people tend to use groups(1) when determining a user's groups. I have no data to support this, it's just the impression that I'm under. :)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.5.txt

Updated patch rebased against trunk.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023314#comment-13023314 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Sanjay and Nicholas, if Todd's suggestion is amenable to you, I'll go ahead and rework the patch to implement it as such. Please let me know.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031503#comment-13031503 ] 

Hudson commented on HADOOP-7214:
--------------------------------

Integrated in Hadoop-Common-trunk-Commit #587 (See [https://builds.apache.org/hudson/job/Hadoop-Common-trunk-Commit/587/])
    HADOOP-7214. Add Common functionality necessary to provide an equivalent of /usr/bin/groups for Hadoop. Contributed by Aaron T. Myers.


> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Daryn Sharp (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019032#comment-13019032 ] 

Daryn Sharp commented on HADOOP-7214:
-------------------------------------

Regarding "user existence", it looks like the logic is if UserGroupInformation.isSecurityEnabled() is true, then an empty group list means "No such user".  Does this mesh with your understanding of S3?

Otherwise I think it looks ok.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016096#comment-13016096 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

My conclusion is simple:  if there is an easy fix for this, go for it.  But if we're inventing a bunch of tool-age to support brokenness, I think it is bad to add the weight long term (yes, this includes RESTs, RPCs, etc, etc).  Thus why I'd rather just say say that we show the groups in the job conf and let it go at that.

BTW, it is also perfectly reasonable to expect that companies that decide to have split naming services to provide ways to query that information on their own.  There could be some potential security issues that we might be circumventing by providing that information out-of-band.

The other thing to keep in mind that going down the path of 'hadoop groups' is too limiting.  If we are going to provide group information, why not also provide uid, username, etc.  In the case of a kerberized environment, there is no guarantee that the TGT info matches what is actually executed on the compute nodes due to remapping...



> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019245#comment-13019245 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks a lot for the review, Daryn.

bq. Regarding "user existence", it looks like the logic is if UserGroupInformation.isSecurityEnabled() is true, then an empty group list means "No such user". Does this mesh with your understanding of S3?

I'm under the impression that enabling Kerberos support on a Hadoop S3 file system isn't currently supported, so this isn't an issue.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016707#comment-13016707 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

bq. However, this OS doesn't provide a way for a user to find out what groups they're a member of from the OS's perspective.

Hadoop doesn't ultimately 'own' those resources.  They come from an external source.  Where does username to uid mapping click in for running tasks?   All of these mappings sit outside Hadoop.

bq. But, do you also consider having different users/groups on the client machine vs the NN to be a misconfiguration? That seems like a perfectly reasonable setup to me, and one that we should support.

Yes, very much so. It breaks a lot of different services (not just Hadoop) and is confusing to users. If one insists on this foolishness, there are other ways to solve this problem that don't involve us. This is a snow flake that could easily turn into an avalanche.

FWIW, we also allow laptops to connect directly to one of our Hadoop grids. So yes, I'm very familiar and have thought a lot about this particular problem already. That's why we have other services that allow users to see what groups they belong to, their user id, etc, etc.

Again: why do we need to provide a solution inherent in the software for ultimately is a problem that is a) much larger than our software and b) can be solved without us doing anything?  Just because we *can* do something doesn't mean we *should*.

bq. Perhaps, but Hadoop also supports making the user -> group mapping service pluggable via the {{hadoop.security.group.mapping}} configuration parameter. Why should we require implementers of this to provide a way of querying this information on their own, through some other mechanism, rather than have Hadoop show it? When a Hadoop user gets a "permission denied" error from a Hadoop command, and wants to know what groups Hadoop thinks they belong to, they'll have to run "{{random-command-x}}" rather than something simple like "{{hadoop fs -groups}}". That only seems to make Hadoop harder to use.

If someone writes a pluggable module, then this is something that needs to get factored into the cost of using that plug-in. What happens if those groups aren't in a displayable format? 

Also, what happens if they aren't using the command line?  Are we going to write a jsp too?  This is going to quickly balloon out of control. :( 

bq. Hadoop assumes that file system implementations are capable of associating files and directories with users and groups, as HDFS does. 

Sort of.  There is no reason why a file system's implementation of users and groups couldn't be a nop.  (Actually, isn't that the case for S3, Cassandra, and a few other non-POSIX-likes already?)  What do we display in the case where the group is useless information for the file system in use? 

bq. My point is just that Hadoop isn't hiding this information as it stands. Hadoop makes decisions based on the groups a user belongs to, so we should make it easy for our users to find out what groups Hadoop thinks they belong to.

...except Hadoop is told what groups a user belongs to by an external source. Why shouldn't it be the responsibility of the external source to share this information?  We're the consumer, not the provider when it comes to naming services.

bq. Showing the username seems reasonable to me, and in fact the patch I'm working on displays this. Hadoop doesn't make decisions based on one's UID, so why should we show that?

...

bq. I don't follow this reasoning. Kerberos doesn't have any notion of groups. But, the first component of the Kerberos principal name is used as the username when the NN and JT determine a user's groups. I don't see how we need to account for anything differently with or without Kerberos support enabled.

Look at the bigger picture and not just focus on groups for a second:

Let's say I fire my job off with a principal of user/joe.  But thanks to remapping (HADOOP-6526), the task actually gets run as username fred with a uid of 50. I access a file on the local system (or heck, even NFS) that is not readable by fred/50.  Using the same logic of "oh noes users don't know their groups", we should be reporting this other information too.

This is a slippery slope and I really really don't think we want to go down this road.

(PS, some Kerberos implementations actually do pass group information along...)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017809#comment-13017809 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. In particular, I'm thinking of MS PACs, which I don't think are exposed in the same way UNIX groups are. I can easily see someone implementing a user-to-group plug-in to use them, given the staggering amount of Active Directory deployments. ... I think we likely have an API issue.

Is your point that an implementation of this hypothetical plugin would want to only allow a user to see their own groups, but not other user's? If so, these semantics can already be implemented with the current API. The implementation of this MS PAC user-to-group plug-in would need only to check the the currently logged-in user against the user for which the groups are being requested, and either throw an error or return an empty list in the event they do not match.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014932#comment-13014932 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks for the comments, Allen. Responses inline.

bq. But one thing comes to mind: isn't this information already reported as part of the job conf? Shouldn't a user be able to query their job's information (at least via the web UI... probably should have something shell-ish) to determine this information?

Even if it were available, that might solve the problem for MR jobs, but that doesn't solve the whole problem. Suppose user {{foo}} tries to {{chgrp}} some file to group {{bar}}, but they get an error along the lines of "{{Error: user 'foo' does not belong to 'bar'}}." How should the user go about seeing what groups they are a member of, and therefore what groups they can {{chgrp}} files to? The only option at present is to {{ssh}} to the NN and run "groups".

bq. Granted, it would be nice to know prior to job submission, but we're fundamentally looking at a debugging tool for broken environments.

I don't entirely agree that this is only for debugging a broken environment. I think I over-emphasized the "different groups on NN vs. JT" portion of this issue in my original issue description. Here's a better one:

A user should be able to see what groups they are a member of from the command line. For HDFS permissions, the only groups that matter are those on the NN. For M/R permissions, the only groups that matter are those on the JT. So, whatever we implement to show the user their group membership should be able to query the NN and JT separately.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.11.patch

Thanks for the comments, Todd. I missed adding the file during the splitting of the protocol interfaces.

Updated patch attached.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029130#comment-13029130 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. What's the point of the "helper method to get the current user" since it's just wrapping another method directly? Seems silly.

Happy to remove it. It is only a very slight convenience.

bq. We need to make sure we commit this at the same time as we commit the MR and HDFS changes, right? (ie this will break the build if we don't?)

Absolutely. Unless we do what I just suggested in MAPREDUCE-2473, i.e. don't modify the existing {{RefreshUserMappings}} interface, and instead make a separate interface just for fetching.

Let me know if that sounds good to you and I'll go ahead and make those changes.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.3.txt

Updated patch to address javadoc warning and test failure.

No tests are provided as part of this patch because the tests for this command, like all FsShell commands, are in HDFS.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016655#comment-13016655 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

{code}
   static final String TAIL_USAGE="-tail [-f] <file>";
+  static final String GROUPS_USAGE="-groups [USERNAME]...";
   static final String DU_USAGE="-du [-s] [-h] <paths...>";
{code}
{code}
       "[-tail [-f] <path>] [-text <path>]\n\t" +
+      GROUPS_USAGE + "\n\t" +
       "[" + FsShellPermissions.CHMOD_USAGE + "]\n\t" +
{code}
It seems that the new command message is not following the existing convention.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-7214:
--------------------------------

       Resolution: Fixed
    Fix Version/s: 0.23.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Aaron!

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014948#comment-13014948 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

No, no. I'm thinking in a broader context.

If the naming services have been setup properly, the client, the NN, the JT, the datanodes, etc, will be pulling consistent group definitions from a the same source.  Whether this is files, NIS, LDAP, NetInfo, ...  is irrelevant.  If they aren't consistent, then the operating environment is fundamentally broken.  [In fact, at large scales, they *must* be consistent otherwise all sorts of bad things happen...]

Yes, I'm fully aware that some people don't do that.  But that's a bad practice and well, it has its costs.  The inability to get group information from a server (Hadoop or otherwise) being one of them.  I don't see much value in hacking a solution for something that people shouldn't be doing.  

"It hurts when I do that."

"So don't do that."

FWIW, secured NFS has the exact same scenario.  Group membership is owned by the server, not the client.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017206#comment-13017206 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. I don't think groups is POSIX... which is why it has different output on different platforms. (Even if it was, POSIX rarely, if ever, defines the output format of those commands).

Even if it's not specified by POSIX, it seems like a good idea to make the output similar to the groups command which is most-commonly used by users. I did this in the formatting of the normal output, no reason not to do it in the case of error.

bq. In the case of our groups command, giving a username that isn't the caller's should be privileged.

Why? On every system I've ever used this hasn't been the case, so I don't see why it should be the case here.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019502#comment-13019502 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

Should this option actually be -id instead of -groups?  The reason I ask is because groups(1) is a BSD command and as folks have discovered fairly non-standardized in a variety of ways.  id(1)'s params have a POSIX definition.  It also allows us to add user/uid mapping capability (later?) in a POSIX-compatible way.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.7.patch

Updated patch which switches designs to use a separate protocol.

Note that in this patch I renamed {{RefreshUserMappingsProtocol}} to {{UserGroupMappingProtocol}}. The only other changes in this file are bumping the protocol version ID and adding the {{getGroupsForUser}} method.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment: hadoop-7214.10.patch

Attaching the patch to the right JIRA this time. :P

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020073#comment-13020073 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

> ... Could you please elaborate? Is the point that you agree that users should be able to determine their group membership, but we should use some other process that isn't the NN to do it?

Yes, it makes sense to let user to determine their group membership.  In case of LDAP, clients should talk to the LDAP server directly.  In case of shell, it may be better to run a separated group server/process for the clients outside the domain.

NN is already a bottleneck of the system.  We don't want to overload it with other functionality.

I think we need some helps here.  Could any security expert comment on this?

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023060#comment-13023060 ] 

Todd Lipcon commented on HADOOP-7214:
-------------------------------------

Hi folks. It seems there are some concerns over this patch, but no one has actually vetoed it. It seems like most of the concerns fall under the following three buckets:

*Concern 1:* "This command is only useful when a system is misconfigured."

Aaron provided the example of developer laptops connecting to a cluster. Another example we see often is a case where Windows client nodes directly upload files to HDFS without going through a "bastion box" or other proxy -- because of the differing group models between Unix and Windows, it's not necessarily clear (especially to non-ops users) what groups they might have on HDFS. Additionally, we often see people setting up Linux boxes with custom pam/nss modules such as VAS which can have semantics that regular users may not expect.

So, those are several examples of where it's useful even in a well-configured system.

I also agree that it's *very* useful when a system is misconfigured. Unfortunately Hadoop gets deployed in some environments with lax policies or inexperienced operations, and they have no central account management system such as LDAP. Although I agree this is a bad setup, we should endeavor to provide basic tools that help a user diagnose how Hadoop is interacting with their bad setup. Of course, we should also try to educate users on best practices at the same time :)

*Concern 2:* "I don't particularly care about this, since my cluster is well setup, so we shouldn't add it."

In practice, I've seen many support issues (on the mailing lists as well as among our customers) where having this tool would have simplified diagnosis and understanding of a problem. So, though it may not be useful for every deployment, it's useful for some, and shouldn't be blocked on this basis.

*Concern 3:* "Adding this command opens greater security exposure"

While I agree with Aaron that this seems to be picking nits, it seems we can address this by adding a configuration allowing the command to be disabled. 


Are there any other concerns that I'm failing to summarize above? If not, I will proceed to review the implementation of the patch and assume that no one is planning on vetoing.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029176#comment-13029176 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478250/hadoop-7214.9.patch
  against trunk revision 1099633.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    -1 release audit.  The applied patch generated 2 release audit warnings (more than the trunk's current 1 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/403//testReport/
Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/403//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/403//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/403//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016093#comment-13016093 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Hey Allen, what conclusion do you draw from this reasoning? That we shouldn't concern ourselves with implementing a way to query the NN and JT independently, since having distinct group resolutions on each is a misconfiguration? Or that we shouldn't provide a way to query either the NN or JT for a user's groups since they should happen to be the same on the client?

I *might* agree with the first conclusion, but can't possible agree with the second. It's perfectly reasonable to have clients which have different sets of users/groups from what's on the servers. For example, at a one organization where I've worked, developers had self-managed laptops, but were permitted to access Hadoop clusters as clients from these laptops. The laptops obviously had different sets of users/groups than what was on these clusters.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023293#comment-13023293 ] 

Todd Lipcon commented on HADOOP-7214:
-------------------------------------

Let's go with #3 then. I think alternative #1 is insufficient because the machine accessing HDFS may be in a different administrative realm (eg a Windows server writing to HDFS). Alternative #2 is insufficient for similar reasons (client machine may not have appropriate libraries installed, or access to the group mapping authority). Alternative #3 seems reasonable (we had considered this as well) but it's unclear what the command would look like. Are you OK with adding new command line tools "bin/hdfs groups" and "bin/mapred groups" that use the configured NN or JT by default? Making users play "guess the IPC port" doesn't seem like a good idea (in my experience not even Hadoop developers can keep straight which ports run IPC vs HTTP)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023358#comment-13023358 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

> 3) A special UGI protocol ...

#3 sounds good.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020131#comment-13020131 ] 

Sanjay Radia commented on HADOOP-7214:
--------------------------------------

The main concern I have here is that Hadoop permissions and security was very explicitly designed to NOT manage
user accounts or group accounts. Hadoop uses the accounts (user and group) from the environment in which Hadoop 
is deployed. This has many advantages for deploying and using Hadoop. 
When the authentication system returns a particular user name after authentication it assumes that the name is a valid user. Groups membership for a user are determined  via a plug in.
Given that Hadoop does not manage user accounts or group accounts it is very strange that the NN and JT have a
method that return group membership. Such a method does not seem to belong in the NN's or JT's interface.
It seems that one needs a library that uses the same plugin that the NN or JT uses. The command can call this 
library. Allen's point is correct that one has to correctly configure all Hadoop components to pull the 
membership from the same source.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7214:
-----------------------------------

    Attachment:     (was: hadoop-7214.1.txt)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HADOOP-7214:
--------------------------------

    Status: Open  (was: Patch Available)

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029055#comment-13029055 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478229/hadoop-7214.8.patch
  against trunk revision 1099633.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/401//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/401//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/401//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017044#comment-13017044 ] 

Allen Wittenauer commented on HADOOP-7214:
------------------------------------------

bq. Make sure it outputs a posix-like error of "groups: foo: no such user".

I don't think groups is POSIX... which is why it has different output on different platforms.  (Even if it was, POSIX rarely, if ever, defines the output format of those commands).

In the case of our groups command, giving a username that isn't the caller's should be privileged.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017238#comment-13017238 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

bq. Because that isn't true on every system that Hadoop may consume. I suspect that we actually have a higher level vulnerability here if the method isn't protected either.

Can you provide an example of such a system? I can say with a high degree of confidence that the authorization semantics of the current patch are the same as that for the vast majority of the systems that Hadoop is currently deployed on.

My inclination is to implement it like this, and if a user of this hypothetical system which you refer to wants it at some point in the future, we can make it a configuration option at the NN. How does that sound?

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020221#comment-13020221 ] 

Aaron T. Myers commented on HADOOP-7214:
----------------------------------------

Thanks a lot for the comments, Sanjay.

Group resolution used to be done client-side. When this was the case, calling {{/usr/bin/groups}} was the correct method of determining which groups a user belonged to from HDFS's perspective, because these were indeed the groups that HDFS used to determine file access. With the introduction of Hadoop's security features, group resolution was moved server-side, but provided no method for a user to determine what groups they belong to from Hadoop's perspective. This JIRA seeks to address this deficiency.

Detailed responses inline.

bq. The main concern I have here is that Hadoop permissions and security was very explicitly designed to NOT manage user accounts or group accounts. Hadoop uses the accounts (user and group) from the environment in which Hadoop is deployed. This has many advantages for deploying and using Hadoop.

I completely agree. With the addition of this command, Hadoop is still not _managing_ the user/group mapping - it's just reading them, as it was before. I'm certainly not proposing that we create a command to let a user or admin mutate the user -> group mapping via Hadoop, or to add a Hadoop-specific user/group database. Just to let a user see what groups they belong to from Hadoop's perspective.

I see this as being no different than the way {{/usr/bin/groups}} presently works. By default, the groups on the machine are managed locally via {{/etc/groups}}. Optionally, you can configure a different back-end database (e.g. LDAP, NIS) to provide the user/group mapping for the machine, by editing {{/etc/nsswitch.conf}}, in which case the groups are no longer managed locally. But even if you do this, {{/usr/bin/groups}} will still work.

bq. It seems that one needs a library that uses the same plugin that the NN or JT uses. The command can call this library.

This won't work for the default case, i.e. when the NN/JT are using the {{ShellBasedUnixGroupsMapping}} and a user is interacting with the NN/JT from a remote machine which happens to not have the same user/group mappings as the NN/JT. If the client were to just use this plugin, they would get the user/group mapping on the client machine, not the ones that matter - the ones on the NN/JT.

bq. Allen's point is correct that one has to correctly configure all Hadoop components to pull the membership from the same source.

I agree with this point as well. The question is what qualifies as a "correct" configuration. I claim that having distinct user/group mappings on the client vs. the master servers is a perfectly valid configuration. Sure, it's a little more difficult to reason about, since a user will need to concern themselves with two sets of user/group mappings - local and remote. Part of the reason this is hard to reason about is precisely because a user presently has no way of determining what groups they belong to from Hadoop's perspective.

bq. Thinking a little bit further, one could argue that since Hadoop derives user accounts and membership from 
its environment, one should ask the environment about group membership: that is, use the groups command of 
your environment (typically a unix system).

This won't work for the case I'm concerned about. The user would need to {{ssh}} into the NN or JT (which we usually recommend disabling shell access to) to determine the groups they belong to.

bq. Perhaps the main motivation for Hadoop-groups-command is to detect if Hadoop has been correctly configured 
for the environment.

This is certainly part of the motivation, but by no means the only one. Without a command like this, if a Hadoop admin were attempting to configure a custom {{GroupMappingServiceProvider}}, how would they determine if it were working?

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029525#comment-13029525 ] 

Hadoop QA commented on HADOOP-7214:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12478313/hadoop-7214.11.patch
  against trunk revision 1099633.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 2 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

    +1 system test framework.  The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/407//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/407//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/407//console

This message is automatically generated.

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.10.patch, hadoop-7214.11.patch, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt, hadoop-7214.7.patch, hadoop-7214.8.patch, hadoop-7214.9.patch
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023270#comment-13023270 ] 

Sanjay Radia commented on HADOOP-7214:
--------------------------------------

Concern 6: This command does not belong in NN or JT's interface.  
I see three alternatives: 
1) Just run your OS's groups command on the desktop from which you submit your MR job. In a correctly configured cluster, you should get the same answer.
2) Create a new hadoop command that uses the same plugin that the NN or JT uses. This should help
for a poorly configured environment. 
3) A special UGI protocol that gives this functionality. The NN and the JT can implement that protocol.


I can live with 3.

sanjay

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt, hadoop-7214.2.txt, hadoop-7214.3.txt, hadoop-7214.4.txt, hadoop-7214.5.txt, hadoop-7214.6.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HADOOP-7214) Hadoop /usr/bin/groups equivalent

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017017#comment-13017017 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-7214:
------------------------------------------------

Thanks Aaron.  According to the current convention, I think the usage string probably should be
{code}
GROUPS_USAGE="-groups [<username...>]";
{code}
What do you think?

> Hadoop /usr/bin/groups equivalent
> ---------------------------------
>
>                 Key: HADOOP-7214
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7214
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: hadoop-7214.0.txt, hadoop-7214.1.txt
>
>
> Since user -> groups resolution is done on the NN and JT machines, there should be a way for users to determine what groups they're a member of from the NN's and JT's perspective.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira