Posted to notifications@kyuubi.apache.org by "praveenkumarb1207 (via GitHub)" <gi...@apache.org> on 2023/01/25 18:28:18 UTC
[GitHub] [kyuubi] praveenkumarb1207 opened a new issue, #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
praveenkumarb1207 opened a new issue, #4202:
URL: https://github.com/apache/kyuubi/issues/4202
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I have searched in the [issues](https://github.com/apache/kyuubi/issues?q=is%3Aissue) and found no similar issues.
### Describe the bug
Following the instructions in https://github.com/apache/kyuubi/blob/master/docs/security/authorization/spark/install.md, I installed the Kyuubi plugin by copying all the jars and required configuration files mentioned in the link to $SPARK_HOME.
**Version information:**
Ranger Version - 2.3.0
Spark Version - Spark 3.2.3 with Hadoop 3.3.4
Apache Hive 3.1.2
I created an Iceberg table in Hive using Spark.
**spark-shell command :**
```
spark-shell --packages "org.apache.spark:spark-hive_2.12:3.2.3,org.apache.hadoop:hadoop-aws:3.3.4,org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:1.1.0" \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,org.apache.kyuubi.plugin.spark.authz.ranger.RangerSparkExtension \
--conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
--conf spark.sql.catalog.spark_catalog.type=hive \
--conf spark.sql.catalog.demo=org.apache.iceberg.spark.SparkCatalog \
--conf spark.sql.catalog.demo.type=hadoop \
--conf spark.sql.catalog.demo.warehouse=$PWD/warehouse --proxy-user pb1207
```
**Code :**
```
import spark.implicits._
val df = Seq(
  (1, "America"),
  (2, "India"),
  (3, "London")
).toDF("id", "country")
df.write.format("iceberg").mode("overwrite").option("path", "s3a://bucket/test/test_iceberg.parquet").saveAsTable("test.test_iceberg_table")
```
The Iceberg table was created successfully in Hive.
**code :**
`spark.sql("show create table test.test_iceberg_table").show(false)`
**output :**
![image](https://user-images.githubusercontent.com/59603987/214647412-f15969ba-3dca-4913-a02c-038cd3ab944b.png)
**Selecting from the table in Spark:**
**code :**
`spark.sql("select * from test.test_iceberg_table").show(false) `
**output :**
![image](https://user-images.githubusercontent.com/59603987/214647676-7f5991d2-c692-4484-8470-80206db1b6b0.png)
I created an access policy in Ranger on the Iceberg table, and it worked as expected.
**Ranger Policy :**
![image](https://user-images.githubusercontent.com/59603987/214648714-38e6c704-4520-4f23-b6a2-6d22f194e89b.png)
![image](https://user-images.githubusercontent.com/59603987/214648787-f69e7927-a621-4f2c-a71c-066c24e18c53.png)
**Output from spark :**
![image](https://user-images.githubusercontent.com/59603987/214648874-3eb825eb-4b05-4321-a4ef-2ae0468abded.png)
I then granted access to the table in Ranger and created a row-level filtering policy on the Iceberg table, and it also worked as expected.
**Ranger Policy :**
![image](https://user-images.githubusercontent.com/59603987/214649110-bf5697ac-ab43-44f5-b4cc-36774730170b.png)
![image](https://user-images.githubusercontent.com/59603987/214649327-298afa19-6f86-4e40-b877-55b7f407c3e0.png)
**Output from spark :**
![image](https://user-images.githubusercontent.com/59603987/214649606-7823133c-393b-49c5-ab09-bb792fde7840.png)
However, when I create a masking policy, I get the error below.
**Ranger policy :**
![image](https://user-images.githubusercontent.com/59603987/214649969-8c1da8b6-0af6-470f-aa93-a165ea9942ab.png)
**Output :**
![image](https://user-images.githubusercontent.com/59603987/214650099-8a35274f-b54d-472c-b95f-efd12f4f06d8.png)
**Error :**
```
org.apache.spark.sql.AnalysisException: Resolved attribute(s) country#114 missing from id#113,country#115 in operator !Project [id#113, country#114]. Attribute(s) with the same name appear in the operation: country. Please check if the right attribute(s) are used.;
!Project [id#113, country#114]
+- SubqueryAlias spark_catalog.test.test_iceberg_table
   +- Project [id#113, null AS country#115]
      +- Filter (id#113 > 1)
         +- RowFilterAndDataMaskingMarker
            +- RelationV2[id#113, country#114] spark_catalog.test.test_iceberg_table
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:52)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:51)
at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:474)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:97)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:263)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:97)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:92)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:182)
at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:205)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:75)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:183)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:183)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:75)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:73)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:65)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
... 54 elided
```
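The plan in the error can be read as follows: the masking step projects `null AS country#115`, which carries a new expression ID, while the outer `Project` still refers to `country#114` from the original relation. The sketch below is a toy model of that check in plain Scala, meant only to illustrate the mismatch. The `Attr`, `Plan`, `Relation`, `Project`, and `missingInput` names are hypothetical and are not Spark's or Kyuubi's actual classes; the `Filter` and `SubqueryAlias` nodes are left out because they pass their child's output through unchanged.

```scala
object MaskingPlanDemo {
  // An attribute is identified by its expression ID, not only by its name.
  final case class Attr(name: String, exprId: Int) {
    override def toString: String = s"$name#$exprId"
  }

  sealed trait Plan { def output: Seq[Attr] }
  final case class Relation(output: Seq[Attr]) extends Plan
  // `references` are the child attributes this projection reads;
  // `output` are the attributes it produces.
  final case class Project(references: Seq[Attr], output: Seq[Attr], child: Plan) extends Plan

  // Attributes the operator reads that its child does not produce.
  def missingInput(p: Project): Seq[Attr] =
    p.references.filterNot(a => p.child.output.contains(a))

  val id: Attr = Attr("id", 113)
  val country: Attr = Attr("country", 114)
  // `null AS country` is a new alias, so it gets a new ID (115 in the error).
  val maskedCountry: Attr = Attr("country", 115)

  val relation: Relation = Relation(Seq(id, country))
  // Masking projection: reads only `id`, emits id#113 and country#115.
  val masked: Project = Project(Seq(id), Seq(id, maskedCountry), relation)
  // Outer projection still refers to the pre-masking country#114.
  val outer: Project = Project(Seq(id, country), Seq(id, country), masked)

  def main(args: Array[String]): Unit = {
    println(missingInput(masked)) // List()
    println(missingInput(outer))  // List(country#114)
  }
}
```

In this model the check passes only if the outer projection reads `country#115`, or if the masking projection keeps ID 114 for the masked column. Which of the two applies to the real plugin is not established by this report.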
Could you please look into this issue?
### Affects Version(s)
master
### Kyuubi Server Log Output
_No response_
### Kyuubi Engine Log Output
_No response_
### Kyuubi Server Configurations
_No response_
### Kyuubi Engine Configurations
_No response_
### Additional context
_No response_
### Are you willing to submit PR?
- [ ] Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- [X] No. I cannot submit a PR at this time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [kyuubi] praveenkumarb1207 commented on issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "praveenkumarb1207 (via GitHub)" <gi...@apache.org>.
praveenkumarb1207 commented on issue #4202:
URL: https://github.com/apache/kyuubi/issues/4202#issuecomment-1404550193
Yes, I am using the master branch.
[GitHub] [kyuubi] bowenliang123 commented on issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "bowenliang123 (via GitHub)" <gi...@apache.org>.
bowenliang123 commented on issue #4202:
URL: https://github.com/apache/kyuubi/issues/4202#issuecomment-1404565511
Okay, more investigation is required.
Row filtering on Iceberg tables is covered in `IcebergCatalogRangerSparkExtensionSuite`, which uses `org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions`.
https://github.com/apache/kyuubi/blob/master/extensions/spark/kyuubi-spark-authz/src/test/scala/org/apache/kyuubi/plugin/spark/authz/ranger/IcebergCatalogRangerSparkExtensionSuite.scala#L174
Have a look at it if you have time to do more testing, and feel free to share any details or findings.
[GitHub] [kyuubi] bowenliang123 commented on issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "bowenliang123 (via GitHub)" <gi...@apache.org>.
bowenliang123 commented on issue #4202:
URL: https://github.com/apache/kyuubi/issues/4202#issuecomment-1406237738
Okay, noted.
[GitHub] [kyuubi] bowenliang123 commented on issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "bowenliang123 (via GitHub)" <gi...@apache.org>.
bowenliang123 commented on issue #4202:
URL: https://github.com/apache/kyuubi/issues/4202#issuecomment-1404433109
Are you compiling and using the Authz module from the master branch?
[GitHub] [kyuubi] praveenkumarb1207 commented on issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "praveenkumarb1207 (via GitHub)" <gi...@apache.org>.
praveenkumarb1207 commented on issue #4202:
URL: https://github.com/apache/kyuubi/issues/4202#issuecomment-1406078677
Hi @bowenliang123,
Just to be clear, we are facing the issue only with masking. Row-level filtering is working as expected.
[GitHub] [kyuubi] yaooqinn closed issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
Posted by "yaooqinn (via GitHub)" <gi...@apache.org>.
yaooqinn closed issue #4202: [Bug] Issue when applying Masking policies on Iceberg Tables
URL: https://github.com/apache/kyuubi/issues/4202