You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/01/20 20:04:35 UTC

[GitHub] [iceberg] aokolnychyi opened a new pull request, #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

aokolnychyi opened a new pull request, #6633:
URL: https://github.com/apache/iceberg/pull/6633

   This PR fixes predicate pushdown for copy-on-write MERGE commands, which was broken after #6534. This change contains a test that would previously fail and lead to a data correctness issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #6633:
URL: https://github.com/apache/iceberg/pull/6633#discussion_r1082998249


##########
spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala:
##########
@@ -187,14 +187,12 @@ object RewriteMergeIntoTable extends RewriteRowLevelIcebergCommand with Predicat
     val readRelation = buildRelationWithAttrs(relation, operationTable, metadataAttrs)
     val readAttrs = readRelation.output
 
-    val (targetCond, joinCond) = splitMergeCond(cond, readRelation)

Review Comment:
   I reverted changes in #6534 for copy-on-write operations. It was not safe as pushing the join condition into a filter on the left side is not safe in LeftOuter and FullOuter joins. It changes the output, which can lead to loosing records that did not match the condition (see the new test).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi merged pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi merged PR #6633:
URL: https://github.com/apache/iceberg/pull/6633


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #6633:
URL: https://github.com/apache/iceberg/pull/6633#discussion_r1082998249


##########
spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala:
##########
@@ -187,14 +187,12 @@ object RewriteMergeIntoTable extends RewriteRowLevelIcebergCommand with Predicat
     val readRelation = buildRelationWithAttrs(relation, operationTable, metadataAttrs)
     val readAttrs = readRelation.output
 
-    val (targetCond, joinCond) = splitMergeCond(cond, readRelation)

Review Comment:
   I reverted changes in #6534 for copy-on-write operations. It was not safe as pushing the join condition into a filter on the left side is not safe in LeftOuter and FullOuter joins. It changes the output, which can lead to loosing records that did not match the condition.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on PR #6633:
URL: https://github.com/apache/iceberg/pull/6633#issuecomment-1401352111

   Thank you, @amogh-jahagirdar @RussellSpitzer!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org