Posted to dev@spark.apache.org by "0.0" <40...@qq.com> on 2017/11/29 07:01:58 UTC

Re: In Structured Streaming, multiple streaming aggregations are not yet supported.

Hi, Das:
	Thanks for your answer.
	By multiple streaming aggregations, I mean this:
	
		df.groupBy("key").agg(min("colA").as("min")).groupBy("min").count()
	
	For example: the data source is a stream of user login records. My temp view (USER_LOGIN_TABLE) has two fields: user_id and login_failed. The goal is to count the users who failed to log in 3 or more times within 5 minutes. My first SQL is:
		SELECT user_id, count(1) AS failed_num
		FROM USER_LOGIN_TABLE
		WHERE login_failed
		GROUP BY user_id
	I registered the result of that SQL as a new temp view (USER_FAILED_TABLE). Then the second SQL is:
		SELECT count(user_id)
		FROM USER_FAILED_TABLE
		WHERE failed_num>=3
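
	Put together, here is a minimal sketch of the two steps as one chained DataFrame query (assuming a streaming DataFrame named logins with columns timestamp, user_id, and login_failed, plus a 5-minute tumbling window; the names are illustrative). This is exactly the two-aggregations-in-a-row shape in question:

		import org.apache.spark.sql.functions._
		import spark.implicits._

		val failedPerUser = logins
		  .where($"login_failed")
		  .groupBy(window($"timestamp", "5 minutes"), $"user_id")
		  .agg(count(lit(1)).as("failed_num"))   // first streaming aggregation

		val overThreshold = failedPerUser
		  .where($"failed_num" >= 3)
		  .groupBy($"window")
		  .count()                               // second aggregation, one after another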
	
	Thanks.





------------------ Original Message ------------------
Hello, 


What do you mean by multiple streaming aggregations? Something like this is already supported.

df.groupBy("key").agg(min("colA"), max("colB"), avg("colC"))


But the following is not supported. 


df.groupBy("key").agg(min("colA").as("min")).groupBy("min").count()


In other words, multiple aggregations ONE AFTER ANOTHER are NOT supported yet, and we currently don't have any plans to support that by 2.3. 
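
Continuing the rate-source sketch above, the chained form fails at analysis time, before any data is processed:

val chained = df
  .groupBy(($"value" % 10).as("key"))
  .agg(min("value").as("min"))
  .groupBy("min")
  .count()

// start() throws org.apache.spark.sql.AnalysisException, reporting that
// multiple streaming aggregations are not supported on streaming Datasets.
chained.writeStream.outputMode("complete").format("console").start()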


If this is what you want, can you explain the use case, i.e., why you need multiple chained aggregations?


On Tue, Nov 28, 2017 at 9:46 PM, Georg Heiler <ge...@gmail.com> wrote:
2.3 around January 
0.0 <40...@qq.com> wrote on Wed., Nov. 29, 2017 at 05:08:

Hi, all:
    Multiple streaming aggregations are not yet supported. When will they be supported? Are they on the roadmap?


Thanks.