Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/16 11:54:27 UTC

[GitHub] [spark] HeartSaVioR opened a new pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

HeartSaVioR opened a new pull request #35543:
URL: https://github.com/apache/spark/pull/35543


   ### What changes were proposed in this pull request?
   
   This PR proposes applying strict nullability to the nested columns of the window struct for both time window and session window, so that the replacement respects the dataType of TimeWindow and SessionWindow.
   
   ### Why are the changes needed?
   
   The implementations of the TimeWindowing and SessionWindowing rules are exposed to a possible inconsistency between the dataType of TimeWindow/SessionWindow and the dataType of the replacement. For the replacement, the optimizer may decide that the value expressions are non-nullable.
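   
   The mismatch described above can be sketched with a toy model. This is only an illustration under stated assumptions: `Expr`, `Column`, `Field`, and `structSchema` are hypothetical stand-ins and not Catalyst classes, `KnownNullable` here only borrows the name of the expression added in this PR, and the declared schema is assumed to mark both nested fields as nullable.
   
   ```scala
   // Toy model only: hypothetical stand-ins, not Spark's Catalyst classes.
   object NullabilitySketch {
     sealed trait Expr { def nullable: Boolean }
   
     // A value expression whose nullability was decided elsewhere (e.g. by an optimizer).
     final case class Column(name: String, nullable: Boolean) extends Expr
   
     // Tags its child as nullable regardless of what the child itself reports.
     final case class KnownNullable(child: Expr) extends Expr {
       val nullable: Boolean = true
     }
   
     final case class Field(name: String, nullable: Boolean)
   
     // Schema that a struct-building expression would report,
     // derived from the nullability of its value expressions.
     def structSchema(fields: Seq[(String, Expr)]): Seq[Field] =
       fields.map { case (name, expr) => Field(name, expr.nullable) }
   
     // Declared schema of the window struct (assumption: both fields nullable).
     val declared: Seq[Field] =
       Seq(Field("start", nullable = true), Field("end", nullable = true))
   
     // Suppose the value expressions were decided to be non-nullable.
     val start: Expr = Column("window_start", nullable = false)
     val end: Expr = Column("window_end", nullable = false)
   
     // Without tagging, the replacement's schema drifts from the declared one.
     val loose: Seq[Field] = structSchema(Seq("start" -> start, "end" -> end))
   
     // With tagging, the replacement's schema matches the declared one.
     val strict: Seq[Field] =
       structSchema(Seq("start" -> KnownNullable(start), "end" -> KnownNullable(end)))
   
     def main(args: Array[String]): Unit = {
       println(loose == declared)  // prints "false"
       println(strict == declared) // prints "true"
     }
   }
   ```
   
   In this sketch the tag changes only the reported nullability; the child's value is passed through untouched.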
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New tests added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HeartSaVioR commented on pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #35543:
URL: https://github.com/apache/spark/pull/35543#issuecomment-1046496500


   Thanks all for reviewing and merging!




[GitHub] [spark] HeartSaVioR commented on pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #35543:
URL: https://github.com/apache/spark/pull/35543#issuecomment-1042102216


   cc. @cloud-fan @sigmod @viirya @xuanyuanking 




[GitHub] [spark] HeartSaVioR commented on pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #35543:
URL: https://github.com/apache/spark/pull/35543#issuecomment-1042574341


   Ideally, we would have a way to technically prevent such a case (at the code level, or at least at runtime, e.g. by failing the query fast) instead of adding tests to guard against it. I don't know whether that is technically feasible; I'm just thinking about the ideal.




[GitHub] [spark] viirya commented on a change in pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
viirya commented on a change in pull request #35543:
URL: https://github.com/apache/spark/pull/35543#discussion_r808792856



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/constraintExpressions.scala
##########
@@ -30,6 +30,17 @@ trait TaggingExpression extends UnaryExpression {
   override def eval(input: InternalRow): Any = child.eval(input)
 }
 
+case class KnownNullable(child: Expression) extends TaggingExpression {
+  override def nullable: Boolean = true
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    child.genCode(ctx)

Review comment:
       `child.genCode(ctx).copy(isNull = TrueLiteral)`?






[GitHub] [spark] viirya commented on pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #35543:
URL: https://github.com/apache/spark/pull/35543#issuecomment-1046495112


   Thanks. Merging to master.




[GitHub] [spark] viirya closed pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
viirya closed pull request #35543:
URL: https://github.com/apache/spark/pull/35543


   




[GitHub] [spark] HeartSaVioR commented on pull request #35543: [SPARK-38227][SQL][SS] Apply strict nullability of nested column in time window / session window

Posted by GitBox <gi...@apache.org>.
HeartSaVioR commented on pull request #35543:
URL: https://github.com/apache/spark/pull/35543#issuecomment-1046396879


   cc. @cloud-fan @sigmod @viirya @xuanyuanking Friendly reminder.
   

