Posted to dev@submarine.apache.org by "Pedro Rossi (Jira)" <ji...@apache.org> on 2020/07/09 14:37:00 UTC

[jira] [Commented] (SUBMARINE-562) Secure raw read and writes to hdfs

    [ https://issues.apache.org/jira/browse/SUBMARINE-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154632#comment-17154632 ] 

Pedro Rossi commented on SUBMARINE-562:
---------------------------------------

[~Qin Yao] I am tagging you since you were responsible for the project before it merged with Submarine, and I suppose you were also responsible for most of the current design of the security module. Huge thanks for your work on this module as well!

> Secure raw read and writes to hdfs
> ----------------------------------
>
>                 Key: SUBMARINE-562
>                 URL: https://issues.apache.org/jira/browse/SUBMARINE-562
>             Project: Apache Submarine
>          Issue Type: Improvement
>          Components: Security
>            Reporter: Pedro Rossi
>            Priority: Minor
>
> While testing the security plugin at my company, I noticed that running a "select * from table" and reading the table's path directly from HDFS produce the same plan, except that the raw path read exposes only the path URI, which the PrivilegesBuilder class does not take into account. I designed an internal patch for this module at my company to address the issue by adding this to the buildQuery function
> {code:java}
> case l: LogicalRelation =>
>   if (l.catalogTable.nonEmpty) {
>     // Table-backed relation: build column-level privileges from the catalog table.
>     mergeProjection(l.catalogTable.get)
>   } else if (l.relation.isInstanceOf[HadoopFsRelation]) {
>     // Path-backed relation with no metastore entry: request a DFS_URI privilege
>     // for every root path the relation reads from.
>     for (path <- l.relation.asInstanceOf[HadoopFsRelation].location.rootPaths)
>       privilegeObjects += new SparkPrivilegeObject(
>         SparkPrivilegeObjectType.DFS_URI, path.toString, path.toString)
>   }
> {code}
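> To make the scenario concrete, here is a minimal sketch of the kind of raw read that currently bypasses the checks (the path and the session setup are illustrative, not from my actual environment):
> {code:java}
> // Reading a table's backing files directly, bypassing the metastore.
> // The plan contains a LogicalRelation over a HadoopFsRelation whose
> // catalogTable is empty, so without the case above buildQuery emits
> // no privilege objects for it.
> val df = spark.read.parquet("/warehouse/db.db/some_table")  // illustrative path
> df.show()
> {code}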
> and this to the buildCommand function
> {code:java}
> case i: InsertIntoHadoopFsRelationCommand =>
>   // Table-backed insert: record table/view-level output objects as before.
>   i.catalogTable.foreach { t =>
>     addTableOrViewLevelObjs(
>       t.identifier,
>       outputObjs,
>       i.partitionColumns.map(_.name),
>       t.schema.fieldNames)
>   }
>   // Raw-path insert with no metastore table: request a DFS_URI privilege on the
>   // output path so direct writes are also authorized.
>   if (i.catalogTable.isEmpty) {
>     outputObjs += new SparkPrivilegeObject(
>       SparkPrivilegeObjectType.DFS_URI, i.outputPath.toString, i.outputPath.toString)
>   }
> {code}
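> And the corresponding write side, a minimal sketch of the kind of direct write the second change covers (again, the path is illustrative):
> {code:java}
> // Writing a DataFrame straight to a path, with no metastore table involved.
> // Spark plans an InsertIntoHadoopFsRelationCommand with an empty catalogTable,
> // so the DFS_URI output object added above is the only authorization hook.
> df.write.mode("overwrite").parquet("/tmp/scratch/output")  // illustrative path
> {code}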
> I understand that this project proposes Hive authorization rather than HDFS authorization, but even so, people in the Spark environment often write temporary files without metastore tables, and those reads and writes should also pass through authorization.
> I am creating this issue to ask the maintainers whether this is relevant and within the scope of the Security module, so that I can provide a patch for it.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@submarine.apache.org
For additional commands, e-mail: dev-help@submarine.apache.org