Posted to user@flink.apache.org by Saiph Kappa <sa...@gmail.com> on 2016/01/13 18:22:49 UTC

Flink DataStream and KeyBy

Hi,

This line «stream.keyBy(0)» only works if stream is of type
DataStream[Tuple], and this Tuple is not a Scala tuple but a Flink tuple
(why not use the Scala tuple?). Currently, keyBy can be applied to anything
(at least in Scala), such as DataStream[String] and
DataStream[Array[String]].

Can anyone confirm this?

Thanks.

Re: Flink DataStream and KeyBy

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
Using .keyBy(0) on a Scala DataStream[Tuple2], where Tuple2 is a Scala tuple, should work. See, for example, the SocketTextStreamWordCount example in Flink.
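
A minimal sketch of that case, assuming a Flink 1.x-era Scala DataStream API in which positional keys are still available (later releases deprecated them). The object name and the sample data are made up for illustration. The wildcard import supplies the implicit TypeInformation that lets Flink treat a Scala tuple as a type with addressable fields.

```scala
import org.apache.flink.streaming.api.scala._

object PositionalKeySketch {
  // Hypothetical sample data: plain Scala tuples, not Flink's Java Tuple2.
  val input: Seq[(String, Int)] = Seq(("flink", 1), ("scala", 1), ("flink", 1))

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // The wildcard import above provides the implicit TypeInformation for (String, Int).
    val counts = env
      .fromCollection(input)
      .keyBy(0) // key on the first tuple field (position 0)
      .sum(1)   // running sum of the second field, per key

    counts.print()
    env.execute("positional key sketch")
  }
}
```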

Cheers,
Aljoscha
> On 13 Jan 2016, at 18:25, Tzu-Li (Gordon) Tai <tz...@gmail.com> wrote:
> 
> Hi Saiph,
> 
> In Flink, the key for keyBy() can be provided in different ways:
> https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#specifying-keys
> (the doc is for DataSet API, but specifying keys is basically the same for
> DataStream and DataSet).
> 
> As described in the documentation, calls like keyBy(0) are meant for Tuples,
> so they only work for DataStream[Tuple]. Other key definitions, like
> keyBy(new KeySelector() {...}), can take a DataStream of basically any
> data type. At runtime, Flink finds out whether or not there is a conflict
> between the type of the data in the DataStream and the way the key is
> defined.
> 
> Hope this helps!
> 
> Cheers,
> Gordon
> 
> --
> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-DataStream-and-KeyBy-tp4271p4272.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.


Re: Flink DataStream and KeyBy

Posted by "Tzu-Li (Gordon) Tai" <tz...@gmail.com>.
Hi Saiph,

In Flink, the key for keyBy() can be provided in different ways:
https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#specifying-keys
(the doc is for DataSet API, but specifying keys is basically the same for
DataStream and DataSet).

As described in the documentation, calls like keyBy(0) are meant for Tuples,
so they only work for DataStream[Tuple]. Other key definitions, like
keyBy(new KeySelector() {...}), can take a DataStream of basically any
data type. At runtime, Flink finds out whether or not there is a conflict
between the type of the data in the DataStream and the way the key is
defined.
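
For a non-tuple stream, the key-selector route might look like the sketch below. It assumes the Scala DataStream API, where keyBy also accepts a plain function in place of an explicit KeySelector. The object name, the keyOf helper, and the comma-separated sample lines are hypothetical.

```scala
import org.apache.flink.streaming.api.scala._

object KeySelectorSketch {
  // Hypothetical helper: use the text before the first comma as the key.
  def keyOf(line: String): String = line.takeWhile(_ != ',')

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // A DataStream[String] has no tuple fields, so a positional key like keyBy(0) does not fit.
    val lines: DataStream[String] = env.fromElements("a,1", "b,2", "a,3")

    // Key by a function of the element instead of a field position.
    val keyed = lines.keyBy(line => keyOf(line))

    // Keep the longest line seen so far for each key.
    keyed.reduce((x, y) => if (y.length > x.length) y else x).print()
    env.execute("key selector sketch")
  }
}
```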

Hope this helps!

Cheers,
Gordon




