Posted to issues@iceberg.apache.org by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/01/20 20:04:35 UTC
[GitHub] [iceberg] aokolnychyi opened a new pull request, #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands
aokolnychyi opened a new pull request, #6633:
URL: https://github.com/apache/iceberg/pull/6633
This PR fixes predicate pushdown for copy-on-write MERGE commands, which was broken by #6534. This change adds a test that would have failed before the fix, exposing a data correctness issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands
Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #6633:
URL: https://github.com/apache/iceberg/pull/6633#discussion_r1082998249
##########
spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala:
##########
@@ -187,14 +187,12 @@ object RewriteMergeIntoTable extends RewriteRowLevelIcebergCommand with Predicat
val readRelation = buildRelationWithAttrs(relation, operationTable, metadataAttrs)
val readAttrs = readRelation.output
- val (targetCond, joinCond) = splitMergeCond(cond, readRelation)
Review Comment:
I reverted the changes from #6534 for copy-on-write operations. Pushing the join condition into a filter on the left side is not safe for LeftOuter and FullOuter joins: it changes the join output, which can lead to losing records that did not match the condition (see the new test).
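To illustrate the point about outer joins, here is a minimal sketch in plain Scala collections (no Spark or Iceberg APIs). The `Target` and `Source` case classes, the sample rows, the `leftOuterJoin` helper, and the `targetOnly` predicate are all hypothetical and are not taken from the PR or its test. The sketch only compares evaluating a target-side conjunct inside a left outer join against applying the same conjunct as a filter on the left side before the join.

```scala
// Hypothetical rows for illustration only; not taken from the PR's test.
final case class Target(id: Int, dep: String)
final case class Source(id: Int, dep: String)

object LeftOuterPushdownSketch {
  // Naive left outer join: every left row is kept; unmatched rows pair with None.
  def leftOuterJoin(left: Seq[Target], right: Seq[Source])(
      cond: (Target, Source) => Boolean): Seq[(Target, Option[Source])] =
    left.flatMap { t =>
      val matches = right.filter(s => cond(t, s))
      if (matches.isEmpty) Seq((t, Option.empty[Source]))
      else matches.map(s => (t, Option(s)))
    }

  val target = Seq(Target(1, "hr"), Target(2, "it"), Target(3, "ops"))
  val source = Seq(Source(1, "finance"), Source(2, "legal"))

  // Assumed stand-in for the target-only part of a MERGE condition.
  val targetOnly: Target => Boolean = _.id >= 2

  // Whole condition evaluated inside the join: all target rows survive.
  val inJoin = leftOuterJoin(target, source)((t, s) => t.id == s.id && targetOnly(t))

  // Target-only conjunct pushed into a filter on the left side:
  // rows failing the filter are dropped before the join ever sees them.
  val pushedDown = leftOuterJoin(target.filter(targetOnly), source)((t, s) => t.id == s.id)

  def main(args: Array[String]): Unit = {
    println(inJoin.map(_._1.id))     // target ids that reach the output
    println(pushedDown.map(_._1.id)) // target ids after the unsafe pushdown
  }
}
```

In this sketch the in-join variant emits target ids 1, 2 and 3, while the pushed-down variant emits only 2 and 3. If the join output is what gets written back when files are rewritten, the row with id 1 would no longer be carried over, which is the kind of record loss the comment describes.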
[GitHub] [iceberg] aokolnychyi merged pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands
Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi merged PR #6633:
URL: https://github.com/apache/iceberg/pull/6633
[GitHub] [iceberg] aokolnychyi commented on pull request #6633: Spark 3.3: Fix predicate pushdown for copy-on-write MERGE commands
Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on PR #6633:
URL: https://github.com/apache/iceberg/pull/6633#issuecomment-1401352111
Thank you, @amogh-jahagirdar @RussellSpitzer!