Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2021/11/22 20:38:00 UTC

[jira] [Commented] (SPARK-37439) org.apache.spark.sql.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets;; despite of time-based window

    [ https://issues.apache.org/jira/browse/SPARK-37439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447616#comment-17447616 ] 

Jungtaek Lim commented on SPARK-37439:
--------------------------------------

Hi,

By "time-based window" we mean the time windows that Structured Streaming (SS) supports natively:

[http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#types-of-time-windows]

A window spec (WindowSpec) is not supported. It defines the window boundary in a non-time-based manner, by the offset(s) of rows, which is hard to track in a streaming context.
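
As a rough illustration of the difference, here is a minimal sketch of a natively supported time window: it groups by window(...) over an event-time column instead of ranking over a WindowSpec. The object name, the helper earliestPerWindow, the rate source, and the column names "key", "time" and "value" are assumptions for the example, not taken from the reporter's job, and picking the earliest row with min over a struct is just one possible substitute for the rank() == 1 filter.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, min, struct, window}

// Hypothetical sketch: names below are illustrative, not from the original report.
object TimeWindowSketch {

  // Keeps the earliest (time, value) pair per key within each
  // 10-second tumbling window. The window boundary is defined by the
  // event-time column, so the query can run on a streaming DataFrame.
  def earliestPerWindow(events: DataFrame): DataFrame =
    events
      .withWatermark("time", "10 seconds")
      .groupBy(col("key"), window(col("time"), "10 seconds"))
      // Structs are ordered field by field, so min picks the smallest time.
      .agg(min(struct(col("time"), col("value"))).as("earliest"))
      .select(
        col("key"),
        col("window"),
        col("earliest.time").as("time"),
        col("earliest.value").as("value"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("time-window-sketch")
      .getOrCreate()

    // The built-in "rate" source emits (timestamp, value) rows;
    // it stands in here for the real input stream.
    val events = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()
      .select(
        (col("value") % 3).cast("string").as("key"),
        col("timestamp").as("time"),
        col("value"))

    val query = earliestPerWindow(events)
      .writeStream
      .outputMode("update")
      .format("console")
      .option("truncate", "false")
      .start()

    // Run for about 30 seconds, then shut down.
    query.awaitTermination(30000L)
    query.stop()
    spark.stop()
  }
}
```

Replacing the groupBy(..., window(...)) step with rank().over(Window.partitionBy("key").orderBy("time")) on the same streaming input should bring back the AnalysisException quoted in the issue.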

We have a mailing list for users. Please use the user mailing list if you have questions.

[http://spark.apache.org/community.html]

Thanks!

> org.apache.spark.sql.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets;; despite of time-based window
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-37439
>                 URL: https://issues.apache.org/jira/browse/SPARK-37439
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.0.1
>            Reporter: Ilya
>            Priority: Major
>
> Initially posted here: [https://stackoverflow.com/questions/70062355/org-apache-spark-sql-analysisexception-non-time-based-windows-are-not-supported]
>  
> I'm doing window-based sorting in Spark Structured Streaming:
>  
> {{val filterWindow: WindowSpec = Window
>   .partitionBy("key")
>   .orderBy($"time")
> controlDataFrame = controlDataFrame
>   .withColumn("Make Coffee", $"value")
>   .withColumn("datetime", date_trunc("second", current_timestamp()))
>   .withColumn("time", current_timestamp())
>   .withColumn("temp_rank", rank().over(filterWindow))
>   .filter(col("temp_rank") === 1)
>   .drop("temp_rank")
>   .withColumn("digitalTwinId", lit(digitalTwinId))
>   .withWatermark("datetime", "10 seconds")}}
> I'm obtaining {{time}} as {{current_timestamp()}}, and in the schema I see its type as {{StructField(time,TimestampType,true)}}.
> Why doesn't Spark 3.0 allow me to do the window operation based on it, failing with the following exception, even though the field is clearly time-based?
>  
> {{21/11/22 10:34:03 WARN SparkSession$Builder: Using an existing SparkSession; some spark core configurations may not take effect.
> org.apache.spark.sql.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets;;Window [rank(time#163) windowspecdefinition(key#150, time#163 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS temp_rank#171], [key#150], [time#163 ASC NULLS FIRST]
> +- Project [key#150, value#151, Make Coffee#154, datetime#158, time#163]}}


