Posted to common-dev@hadoop.apache.org by "Kan Zhang (JIRA)" <ji...@apache.org> on 2008/10/07 06:03:44 UTC

[jira] Created: (HADOOP-4359) Support for data access authorization checking on DataNodes

Support for data access authorization checking on DataNodes
-----------------------------------------------------------

                 Key: HADOOP-4359
                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
             Project: Hadoop Core
          Issue Type: New Feature
          Components: dfs
            Reporter: Kan Zhang
            Assignee: Kan Zhang
             Fix For: 0.20.0


Currently, DataNodes do not enforce any access control on accesses to their data blocks. This makes it possible for an unauthorized client to read a data block as long as she can supply its block ID. It's also possible for anyone to write arbitrary data blocks to DataNodes.

When users request file accesses on the NameNode, file permission checking takes place. Authorization decisions are made as to whether the requested accesses to those files (and implicitly, to their corresponding data blocks) are permitted. However, when it comes to subsequent data block accesses on the DataNodes, those authorization decisions are not made available to the DataNodes and, consequently, such accesses are not verified. DataNodes cannot reach those decisions independently, since they have no concept of files, let alone file permissions.

In order to implement data access policies consistently across HDFS services, there is a need for a mechanism by which authorization decisions made on the NameNode can be faithfully enforced on the DataNodes and any unauthorized access is declined.
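
One way such a mechanism could look is sketched below, purely as an illustration. The release note on this issue describes access tokens used as capabilities; everything else here is an assumption made for the example and is not taken from the attached patches: the token fields, the shared key between NameNode and DataNodes, the HMAC-SHA256 signature, and the names issue_token and verify_token are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret assumed to be known to the NameNode and every DataNode.
SHARED_KEY = b"example-shared-secret"


def _payload(user, block_id, mode, expiry):
    # Canonical byte encoding of the fields covered by the signature.
    fields = {"user": user, "block_id": block_id, "mode": mode, "expiry": expiry}
    return json.dumps(fields, sort_keys=True).encode("utf-8")


def issue_token(key, user, block_id, mode, expiry):
    """NameNode side: record an authorization decision for one block and sign it."""
    mac = hmac.new(key, _payload(user, block_id, mode, expiry), hashlib.sha256).hexdigest()
    return {"user": user, "block_id": block_id, "mode": mode, "expiry": expiry, "mac": mac}


def verify_token(key, token, block_id, mode, now):
    """DataNode side: accept the request only if the token is authentic,
    unexpired, and covers the requested block and access mode."""
    expected = hmac.new(
        key,
        _payload(token["user"], token["block_id"], token["mode"], token["expiry"]),
        hashlib.sha256,
    ).hexdigest()
    if not hmac.compare_digest(expected, token["mac"]):
        return False  # forged or tampered token
    if now >= token["expiry"]:
        return False  # token has expired
    return token["block_id"] == block_id and token["mode"] == mode
```

In this sketch the DataNode never needs to know about files or file permissions: it only checks that the presented token was signed with the shared key and matches the block and access mode being requested.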

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Patch Available  (was: Open)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at40.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch



[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638084#action_12638084 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Yes. I'll post some details later. Do you see any problem with this direction?

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: AccessTokenDesign1.pdf

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: AccessTokenDesign1.pdf, at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Release Note: Introduced access tokens as capabilities for accessing datanodes.
          Status: Patch Available  (was: Open)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch



[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705836#action_12705836 ] 

Raghu Angadi commented on HADOOP-4359:
--------------------------------------

+1. 

This went through a couple of iterations of review.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Patch Available  (was: Open)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-4359:
---------------------------------

      Resolution: Fixed
    Hadoop Flags: [Incompatible change, Reviewed]
          Status: Resolved  (was: Patch Available)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch



[jira] Updated: (HADOOP-4359) Access Token: Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Summary: Access Token: Support for data access authorization checking on DataNodes  (was: Support for data access authorization checking on DataNodes)

> Access Token: Support for data access authorization checking on DataNodes
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: AccessTokenDesign1.pdf, at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Open  (was: Patch Available)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Open  (was: Patch Available)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Open  (was: Patch Available)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch



[jira] Updated: (HADOOP-4359) Access Token: Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment:     (was: AccessTokenDesign1.pdf)

> Access Token: Support for data access authorization checking on DataNodes
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: AccessTokenDesign1.pdf, at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch



[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at34.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at33.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708157#action_12708157 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Uploading a new patch to keep up with trunk changes. No functional change.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707183#action_12707183 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Uploaded a new patch to match current trunk. No functional change.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708656#action_12708656 ] 

Hadoop QA commented on HADOOP-4359:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12407801/at38.patch
  against trunk revision 774018.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/328/console

This message is automatically generated.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at13.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Patch Available  (was: Open)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709133#action_12709133 ] 

Raghu Angadi commented on HADOOP-4359:
--------------------------------------

I just committed this. Thanks Kan!

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Tsz Wo (Nicholas), SZE (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638077#action_12638077 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-4359:
------------------------------------------------

Are you trying to do *authorization without authentication*?  This seems to me theoretically impossible.  Could you explain your design more?

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638065#action_12638065 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

> If we're using Kerberos for authentication, we might use tickets whose authorization data contains a list of block ids.

Yes, that is an option. However, I think it's better for Hadoop to do it in an authentication-independent way. 1) Not being tied to Kerberos tickets allows us to accommodate other authentication mechanisms if needed. 2) It gives us flexibility, and we are not constrained by the peculiarities of the Kerberos implementation, such as authorization field size and expiration and renewal requirements. More importantly, authorization tickets can be issued at a more convenient time after authentication, without depending on the Kerberos KDC.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Open  (was: Patch Available)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638093#action_12638093 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

> Are you trying to do authorization without authentication? 
No. There are three parties here. Authentication is done at the NN. When the client comes to a DN, all the DN needs to know is that the requested operation has been authorized by the NN.

> This seems to me theoretically impossible.
Check out transferable capability tokens.


> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697632#action_12697632 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

I uploaded a preliminary patch to get some early reviews. It's not complete. In particular, only READ and WRITE operations are changed to use access tokens for now and I have yet to add unit tests. But it should give you a fairly good idea of how access tokens are to be used.
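
To make the idea concrete, here is one way a DataNode might gate READ and WRITE requests on an access token. This is a hypothetical sketch in Python, not the code in the attached patch: `AccessMode`, `AccessToken`, and `check_access` are illustrative names, and the token is assumed to have been issued by the NameNode and already verified as authentic.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    READ = "READ"
    WRITE = "WRITE"


@dataclass(frozen=True)
class AccessToken:
    # Hypothetical fields; the real token format is defined by the patch.
    block_id: int
    modes: frozenset


def check_access(token, block_id, mode):
    """Return True only if the token covers this block and this operation."""
    if token is None:
        return False  # no token presented: decline the request
    return token.block_id == block_id and mode in token.modes


read_only = AccessToken(block_id=1234, modes=frozenset({AccessMode.READ}))
print(check_access(read_only, 1234, AccessMode.READ))   # True
print(check_access(read_only, 1234, AccessMode.WRITE))  # False
```

The point of the sketch is that the DataNode decides purely from what the token grants, without knowing anything about files or file permissions.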

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638075#action_12638075 ] 

Doug Cutting commented on HADOOP-4359:
--------------------------------------

> I think it's better for Hadoop to do it in an authentication independent way.

Okay.  So clients would still probably get some sort of signed ticket from the Namenode that encodes the blocks which may be accessed, along with a timeout, etc.  They'd pass this to datanodes with block requests, and the datanode would validate its signature before using it.  Is something like that what you have in mind?
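
The flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not the design that was eventually posted or committed: the JSON token layout, the use of HMAC-SHA256 with a key shared between the NameNode and DataNodes, and the names `issue_token` and `verify_token` are all assumptions made for the example.

```python
import hashlib
import hmac
import json
import time

# Assumption: the NameNode and DataNodes share a secret key out of band.
SHARED_KEY = b"hypothetical-shared-secret"


def issue_token(block_ids, lifetime_secs, key=SHARED_KEY, now=None):
    """NameNode side: sign the permitted block IDs together with an expiry time."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"blocks": sorted(block_ids), "expires": now + lifetime_secs},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, signature


def verify_token(payload, signature, block_id, key=SHARED_KEY, now=None):
    """DataNode side: check the signature, the expiry, and the requested block."""
    now = time.time() if now is None else now
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # forged or tampered token
    claims = json.loads(payload)
    if now > claims["expires"]:
        return False  # token has timed out
    return block_id in claims["blocks"]
```

A client would obtain `(payload, signature)` from the NameNode and present both with each block request. The DataNode needs only the key to validate the request, and never has to authenticate the client itself.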


> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at38.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch


[jira] Issue Comment Edited: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638084#action_12638084 ] 

kzhang edited comment on HADOOP-4359 at 10/8/08 2:21 PM:
------------------------------------------------------------

> Is something like that what you have in mind?

Yes. I'll post some details later. Do you see any problem with this direction?

      was (Author: kzhang):
    Yes. I'll post some details later. Do you see any problem with this direction?
  
> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Status: Patch Available  (was: Open)

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch
>
>


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704530#action_12704530 ] 

Hadoop QA commented on HADOOP-4359:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12406807/at36.patch
  against trunk revision 770044.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/263/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/263/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/263/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/263/console

This message is automatically generated.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at31.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch
>
>


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703847#action_12703847 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Fixed the Findbugs warnings and resubmitting.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at37.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch
>
>


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655097#action_12655097 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

I plan to introduce an HDFS token, called Access Token, as a vehicle to pass data access authorization information from NN to DN. One can think of Access Tokens as capabilities; an Access Token enables its owner to access certain data blocks. It is issued by NN and used on DN. Access Tokens should be generated in such a way that their authenticity can be verified by DN.

In general, tokens can be generated in two ways. A) Using a public-key scheme, where NN chooses a private/public key pair and uses the private key to sign a token. The signature becomes an integral part of the token. DN is given NN's public key, which it uses to verify the signature associated with a token. Since only the NN knows the private key, only the NN can generate a valid token. B) Using a symmetric-key scheme, where NN and all DNs share a secret key. For each token, the NN computes a keyed hash (also known as a message authentication code, or MAC) as the token authenticator. The token authenticator becomes an integral part of the token. When a DN receives a token, it uses its copy of the secret key to re-compute the token authenticator and compares it with the one submitted as part of the token. If they match, the token is verified as authentic. Since only NN and DNs know the key (DNs are trusted to never issue tokens; they only use the key to verify tokens they receive), no third party can forge tokens. Method A has the advantage that DN doesn't have to store any secret key, and it provides stronger security in the sense that even if a DN is compromised, the attacker still can't forge tokens. However, generating and verifying public-key signatures is expensive compared to symmetric-key operations. I plan to use method B to generate Access Tokens.
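To make method B concrete, here is a minimal sketch of computing and checking a token authenticator with a shared key. It is an illustration only, not code from any attached patch: the class name MacSchemeSketch, the method names, and the choice of HmacSHA256 are all assumptions (the comment above does not name a MAC algorithm).

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper illustrating method B (shared secret key + MAC).
final class MacSchemeSketch {
    // Assumption: the thread does not specify the MAC algorithm.
    private static final String ALGORITHM = "HmacSHA256";

    // NN side: compute the authenticator over the serialized token contents.
    static byte[] computeAuthenticator(byte[] sharedKey, byte[] tokenBytes) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(sharedKey, ALGORITHM));
            return mac.doFinal(tokenBytes);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("MAC computation failed", e);
        }
    }

    // DN side: re-compute the authenticator with the DN's copy of the key
    // and compare it with the submitted one in constant time.
    static boolean verify(byte[] sharedKey, byte[] tokenBytes, byte[] submitted) {
        byte[] expected = computeAuthenticator(sharedKey, tokenBytes);
        return MessageDigest.isEqual(expected, submitted);
    }
}
```

A token whose contents are altered after issuance, or one produced without the shared key, fails verify because the re-computed authenticator no longer matches the submitted one.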

Access Tokens are ideally non-transferable, i.e., only the owner can use them. This would mean we don't have to worry if a token gets stolen, for example in transit. One way to make a token non-transferable is to include the owner's id in the token and require whoever uses the token to authenticate herself as the owner specified in it. For now, I plan to simply include the owner's id in the token without having DN verify it. Authentication and verification of the owner id can be added later if needed.

Access Tokens are meant to be lightweight and short-lived. There is no need to renew or revoke an Access Token; when a cached Access Token expires, the client simply gets a new one. Access Tokens should be cached only in memory and never written to disk. A typical use case is as follows. An HDFS client asks NN for block ids/locations for a file. NN verifies that the client is authorized to access the file and sends back block ids/locations along with an Access Token for each block. Whenever the HDFS client needs to access a block, it sends the block id along with its associated Access Token to a DN. DN verifies the Access Token before allowing access to the block. The HDFS client may cache Access Tokens received from NN in memory and only get new tokens from NN when the cached ones expire or when it accesses non-cached blocks.
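The client-side caching behaviour described above could be sketched roughly as follows. This is a hypothetical illustration, not the client code from the patches: TokenCacheSketch, CachedToken, and the fetchFromNameNode callback are invented names, and the callback stands in for whatever call the client actually makes to the NN.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongFunction;

// Hypothetical in-memory cache of per-block Access Tokens on the client.
final class TokenCacheSketch {
    // Opaque token bytes plus the expiration time carried in the token.
    static final class CachedToken {
        final byte[] token;
        final long expiresAtMillis;

        CachedToken(byte[] token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Held in memory only; nothing here is ever written to disk.
    private final Map<Long, CachedToken> cache = new ConcurrentHashMap<>();
    // Assumption: stands in for asking the NN for a fresh token for a block.
    private final LongFunction<CachedToken> fetchFromNameNode;

    TokenCacheSketch(LongFunction<CachedToken> fetchFromNameNode) {
        this.fetchFromNameNode = fetchFromNameNode;
    }

    // Returns the cached token, contacting the NN only when the block is
    // not cached or its cached token has expired.
    CachedToken get(long blockId, long nowMillis) {
        CachedToken cached = cache.get(blockId);
        if (cached == null || cached.expiresAtMillis <= nowMillis) {
            cached = fetchFromNameNode.apply(blockId);
            cache.put(blockId, cached);
        }
        return cached;
    }
}
```

Because expired tokens are simply replaced, the sketch needs no renew or revoke path, which matches the lightweight, short-lived intent stated above.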

An Access Token will look like the following, where access mode can be read, write, replicate, etc.
    TokenID = {expirationDate, ownerID, blockID, accessModes}
    TokenAuthenticator = HMAC(key, TokenID)
    Access Token = {TokenID, TokenAuthenticator}

An Access Token is valid on all DNs regardless of where the data block is actually stored. The secret key used to compute the token authenticator is randomly chosen by the NN and sent to DNs when they first register with the NN. A key rolling mechanism updates this key on the NN and pushes the new key to DNs at regular intervals.
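As a rough illustration of the token layout above and of the checks a DN might apply, consider the sketch below. Everything in it beyond the four TokenID fields and the HMAC is an assumption: the class name AccessTokenSketch, the byte encoding of TokenID, the HmacSHA256 algorithm, and the exact set of checks in permits are illustrative choices, not the format used in the attached patches. Key rolling is not modelled.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.EnumSet;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical model of Access Token = {TokenID, TokenAuthenticator}.
final class AccessTokenSketch {
    enum AccessMode { READ, WRITE, REPLICATE }

    final long expirationDate;            // milliseconds since the epoch
    final String ownerId;                 // carried but not verified by DN
    final long blockId;
    final EnumSet<AccessMode> accessModes;
    final byte[] authenticator;           // HMAC(key, TokenID)

    AccessTokenSketch(long expirationDate, String ownerId, long blockId,
                      EnumSet<AccessMode> accessModes, byte[] authenticator) {
        this.expirationDate = expirationDate;
        this.ownerId = ownerId;
        this.blockId = blockId;
        this.accessModes = EnumSet.copyOf(accessModes);
        this.authenticator = authenticator.clone();
    }

    // Assumed encoding of TokenID; any unambiguous encoding would do.
    static byte[] encodeTokenId(long expirationDate, String ownerId,
                                long blockId, EnumSet<AccessMode> modes) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeLong(expirationDate);
            out.writeUTF(ownerId);
            out.writeLong(blockId);
            int bits = 0;
            for (AccessMode m : modes) {
                bits |= 1 << m.ordinal();
            }
            out.writeInt(bits);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256"); // assumed algorithm
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // NN side: build the TokenID and attach its authenticator.
    static AccessTokenSketch issue(byte[] key, long expirationDate, String ownerId,
                                   long blockId, EnumSet<AccessMode> modes) {
        byte[] tag = hmac(key, encodeTokenId(expirationDate, ownerId, blockId, modes));
        return new AccessTokenSketch(expirationDate, ownerId, blockId, modes, tag);
    }

    // DN side: the token must be authentic, unexpired, issued for the
    // requested block, and must include the requested access mode.
    boolean permits(byte[] key, long requestedBlockId, AccessMode mode, long nowMillis) {
        byte[] expected = hmac(key, encodeTokenId(expirationDate, ownerId, blockId, accessModes));
        return MessageDigest.isEqual(expected, authenticator)
                && nowMillis < expirationDate
                && blockId == requestedBlockId
                && accessModes.contains(mode);
    }
}
```

Since the check depends only on the shared key and the token's own fields, any DN holding the key can evaluate it, which is consistent with a token being valid on all DNs.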

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0
>
>


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699471#action_12699471 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Uploaded a new patch that addresses Raghu's initial comments. Also fixed the Balancer to use access tokens. It seems the OP_READ_METADATA operation on DN is not used by any clients. Shall we remove the code?

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at35.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch
>
>


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637559#action_12637559 ] 

Doug Cutting commented on HADOOP-4359:
--------------------------------------

It would be easier to add authentication to RPC, where we need it anyway, than to have to add it to the socket-level API too.  So maybe it's time to finally experiment with HDFS data access over RPC?

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at19.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch
>
>


[jira] Updated: (HADOOP-4359) Access Token: Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: AccessTokenDesign1.pdf

> Access Token: Support for data access authorization checking on DataNodes
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: AccessTokenDesign1.pdf, at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch
>
>


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Raghu Angadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-4359:
---------------------------------

    Affects Version/s: 0.20.0
        Fix Version/s: 0.21.0

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.20.0
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.21.0
>
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at36.patch

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637984#action_12637984 ] 

Doug Cutting commented on HADOOP-4359:
--------------------------------------

> Moving data access to an authentication-enabled RPC interface doesn't solve the authorization issue we are trying to address here.

True, but it's a start.  It would move us onto a single authentication mechanism, and that will simplify authorization.

If we're using Kerberos for authentication, we might use tickets whose authorization data contains a list of block ids.


> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Updated: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kan Zhang updated HADOOP-4359:
------------------------------

    Attachment: at39.patch

Submitting a new patch to keep up with trunk changes.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Kan Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12637771#action_12637771 ] 

Kan Zhang commented on HADOOP-4359:
-----------------------------------

Moving data access to an authentication-enabled RPC interface doesn't solve the authorization issue we are trying to address here. Even if the DataNodes can authenticate users, they still need to know whether a user has the right to access certain data blocks. Such authorization information can only come from the NameNode.
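
As an illustration only (this is not taken from the attached patches), one way the NameNode's decision could reach a DataNode is a signed token: after the file permission check succeeds, the NameNode signs the user, block ID, and an expiry time with a key it shares with the DataNodes, and the DataNode checks that signature before serving the block. The class name, method names, token layout, and the use of HMAC-SHA256 below are all assumptions made for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch; none of these names come from Hadoop or the patches.
class BlockTokenSketch {

    // HMAC-SHA256 over the payload, keyed with the NameNode/DataNode shared secret.
    static byte[] hmac(byte[] key, String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
    }

    // NameNode side: call only after file permission checking has succeeded.
    // Assumes user names never contain '|'.
    static String issue(byte[] key, String user, long blockId, long expiryMillis)
            throws Exception {
        String payload = user + "|" + blockId + "|" + expiryMillis;
        return payload + "|" + Base64.getEncoder().encodeToString(hmac(key, payload));
    }

    // DataNode side: accept the request only if the signature is valid and the
    // token names this user and this block and has not expired.
    static boolean verify(byte[] key, String token, String user, long blockId,
                          long nowMillis) throws Exception {
        int sep = token.lastIndexOf('|');
        if (sep < 0) {
            return false;
        }
        String payload = token.substring(0, sep);
        byte[] claimed;
        try {
            claimed = Base64.getDecoder().decode(token.substring(sep + 1));
        } catch (IllegalArgumentException e) {
            return false; // signature part is not valid Base64
        }
        // Constant-time comparison of the expected and the presented signature.
        if (!MessageDigest.isEqual(hmac(key, payload), claimed)) {
            return false;
        }
        String[] fields = payload.split("\\|");
        if (fields.length != 3) {
            return false;
        }
        return fields[0].equals(user)
                && Long.parseLong(fields[1]) == blockId
                && Long.parseLong(fields[2]) > nowMillis;
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8);
        String token = issue(key, "alice", 42L, 2000L);
        System.out.println(verify(key, token, "alice", 42L, 1000L)); // same block
        System.out.println(verify(key, token, "alice", 43L, 1000L)); // other block
    }
}
```

With a scheme along these lines the DataNode needs no notion of files or file permissions; it only needs the shared key. How that key would be distributed and rotated is left out of the sketch.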

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>             Fix For: 0.20.0


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707614#action_12707614 ] 

Hadoop QA commented on HADOOP-4359:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12407587/at37.patch
  against trunk revision 772960.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/314/console

This message is automatically generated.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708900#action_12708900 ] 

Hadoop QA commented on HADOOP-4359:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12407938/at40.patch
  against trunk revision 774232.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/333/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/333/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/333/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/333/console

This message is automatically generated.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch, at33.patch, at34.patch, at35.patch, at36.patch, at37.patch, at38.patch, at39.patch, at40.patch


[jira] Commented: (HADOOP-4359) Support for data access authorization checking on DataNodes

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702881#action_12702881 ] 

Hadoop QA commented on HADOOP-4359:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12406413/at31.patch
  against trunk revision 768376.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 15 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 4 new Findbugs warnings.

    +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/246/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/246/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/246/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/246/console

This message is automatically generated.

> Support for data access authorization checking on DataNodes
> -----------------------------------------------------------
>
>                 Key: HADOOP-4359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4359
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Kan Zhang
>            Assignee: Kan Zhang
>         Attachments: at13.patch, at19.patch, at31.patch