Posted to common-issues@hadoop.apache.org by "Andrew Wang (JIRA)" <ji...@apache.org> on 2013/07/23 01:38:49 UTC

[jira] [Updated] (HADOOP-9758) Provide configuration option for FileSystem/FileContext symlink resolution

     [ https://issues.apache.org/jira/browse/HADOOP-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HADOOP-9758:
--------------------------------

    Description: 
With FileSystem symlink support incoming in HADOOP-8040, some clients will want to opt out of transparent symlink resolution. This is roughly analogous to O_NOFOLLOW in open(2).

The rationale is a security model in which a user can invoke a third-party service, running as a service user, to operate on the user's data. For instance, users might want to use Hive to query data in their home directories, where Hive runs as the Hive user and the data is readable by the Hive user. This leads to a security issue with symlinks:

# User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
# Hive checks permissions on the files in {{/user/mallory/hive/}} and allows the query to proceed.
# RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't readable by Mallory, but she can create whatever symlinks she wants in her own scratch directory.
# Hive's MR jobs happily resolve the symlinks and access Ann's private data.

This is also potentially useful for clients using FileContext, so let's add it there too.
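
To illustrate the O_NOFOLLOW analogy, here is a minimal sketch that uses plain java.nio.file rather than the Hadoop FileSystem or FileContext API. The class SymlinkPolicyDemo, the read helper, and its resolveSymlinks flag are hypothetical stand-ins for the proposed configuration option and are not taken from the attached patches. It assumes a POSIX-like platform where the JDK honors LinkOption.NOFOLLOW_LINKS as an open option.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class SymlinkPolicyDemo {

    /**
     * Reads a file. The resolveSymlinks flag is a hypothetical stand-in for
     * the proposed option: when false, the open fails if the final path
     * component is a symlink instead of silently following it.
     */
    static byte[] read(Path path, boolean resolveSymlinks) throws IOException {
        OpenOption[] options = resolveSymlinks
                ? new OpenOption[] {StandardOpenOption.READ}
                : new OpenOption[] {StandardOpenOption.READ, LinkOption.NOFOLLOW_LINKS};
        try (InputStream in = Files.newInputStream(path, options)) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        // Local files standing in for Ann's data and Mallory's planted symlink.
        Path dir = Files.createTempDirectory("symlink-demo");
        Path annFile = Files.write(dir.resolve("ann-data.txt"),
                "private".getBytes(StandardCharsets.UTF_8));
        Path malloryLink = Files.createSymbolicLink(dir.resolve("mallory-data.txt"), annFile);

        // Default behavior: the link is followed and Ann's data is returned.
        System.out.println(new String(read(malloryLink, true), StandardCharsets.UTF_8));

        // With resolution disabled, the open itself is rejected.
        try {
            read(malloryLink, false);
            System.out.println("symlink was followed unexpectedly");
        } catch (IOException e) {
            System.out.println("refused to follow symlink: " + e.getMessage());
        }
    }
}
```

As with O_NOFOLLOW, only the final path component is protected in this sketch; symlinks in parent directories are still followed. Rejecting the link inside the open call, rather than checking first and opening afterwards, avoids reintroducing the same race.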

  was:
With FileSystem symlink support incoming in HADOOP-8040, some clients will want to opt out of transparent symlink resolution. This is roughly analogous to O_NOFOLLOW in open(2).

The rationale is a security model in which a user can invoke a third-party service, running as a service user, to operate on the user's data. For instance, users might want to use Hive to query data in their home directories, where Hive runs as the Hive user and the data is readable by the Hive user. This leads to a security issue with symlinks:

# User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
# Hive checks permissions on the files in {{/user/mallory/hive/}} and allows the query to proceed.
# RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't readable by Mallory, but she can create whatever symlinks she wants in her own scratch directory.
# Hive's MR jobs happily resolve the symlinks and access Ann's private data.

        Summary: Provide configuration option for FileSystem/FileContext symlink resolution  (was: Provide configuration option for FileSystem symlink resolution)
    
> Provide configuration option for FileSystem/FileContext symlink resolution
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-9758
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9758
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-4968-1.patch, hdfs-4968-2.patch, hdfs-4968-3.patch
>
>
> With FileSystem symlink support incoming in HADOOP-8040, some clients will want to opt out of transparent symlink resolution. This is roughly analogous to O_NOFOLLOW in open(2).
> The rationale is a security model in which a user can invoke a third-party service, running as a service user, to operate on the user's data. For instance, users might want to use Hive to query data in their home directories, where Hive runs as the Hive user and the data is readable by the Hive user. This leads to a security issue with symlinks:
> # User Mallory invokes Hive to process data files in {{/user/mallory/hive/}}
> # Hive checks permissions on the files in {{/user/mallory/hive/}} and allows the query to proceed.
> # RACE: Mallory replaces the files in {{/user/mallory/hive}} with symlinks that point to user Ann's Hive files in {{/user/ann/hive}}. These files aren't readable by Mallory, but she can create whatever symlinks she wants in her own scratch directory.
> # Hive's MR jobs happily resolve the symlinks and access Ann's private data.
> This is also potentially useful for clients using FileContext, so let's add it there too.
