Posted to issues@flink.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2016/04/13 16:29:25 UTC

[jira] [Commented] (FLINK-3751) default Operator names are inconsistent

    [ https://issues.apache.org/jira/browse/FLINK-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239333#comment-15239333 ] 

Aljoscha Krettek commented on FLINK-3751:
-----------------------------------------

+1

> default Operator names are inconsistent
> ---------------------------------------
>
>                 Key: FLINK-3751
>                 URL: https://issues.apache.org/jira/browse/FLINK-3751
>             Project: Flink
>          Issue Type: Bug
>          Components: DataSet API, DataStream API
>    Affects Versions: 1.0.1
>            Reporter: Chesnay Schepler
>            Priority: Minor
>
> h3. The Problem
> If a user doesn't name an operator explicitly (generally using the name() method), then Flink auto-generates a name. These generated names are really (like, _really_) inconsistent within and across APIs.
> In the batch API, non-source/-sink operator names are _generally_ formed like this:
> {code}FlatMap (FlatMap at main(WordCount.java:81)){code}
> We have
> * FlatMap, describing the runtime operator type
> * another FlatMap, describing which user-call created this operator
> * main(WordCount.java:81), describing the call location
> This already falls apart when you have a DataSource, which looks like this:
> {code}DataSource (at getDefaultTextLineDataSet(WordCountData.java:70) (org.apache.flink.CollectionInputFormat){code}
> It is missing the call that created the source (fromElements()) and suddenly includes the inputFormat name.
> Sinks are a different story yet again, since collect() is displayed as
> {code} DataSink (collect()) {code}
> which is missing the call location.
> Then we have the Streaming API, where things are named completely differently as well:
> The fromElements source is displayed as
> {code} Source: Collection Source {code}
> Non-source/-sink operators are displayed simply as their runtime operator type
> {code} FlatMap {code}
> and sinks, at times, do not have a name at all.
> {code} Sink: Unnamed {code}
> To put the cherry on top, chains are displayed in the Batch API as
> {code} CHAIN <operator> -> <operator> {code}
> while in the Streaming API we lost the CHAIN keyword
> {code} <operator> -> <operator> {code}
> Considering that these names are right in the user's face via the Dashboard, we should try to homogenize them a bit.
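
For illustration only, the batch naming pattern quoted above (runtime operator type, user call, call location) and the two chain formats could be sketched roughly as follows. OperatorNames, batchName and chainName are hypothetical names made up for this sketch; they are not part of Flink, and this is not how Flink builds its names internally.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not a Flink class. */
final class OperatorNames {
    private OperatorNames() {}

    /** Builds a name in the batch style: "<runtime type> (<user call> at <call location>)". */
    static String batchName(String runtimeType, String userCall, String callLocation) {
        return runtimeType + " (" + userCall + " at " + callLocation + ")";
    }

    /** Joins operator names with " -> "; the batch style additionally prefixes "CHAIN ". */
    static String chainName(boolean batchStyle, List<String> operators) {
        String joined = String.join(" -> ", operators);
        return batchStyle ? "CHAIN " + joined : joined;
    }

    public static void main(String[] args) {
        // prints "FlatMap (FlatMap at main(WordCount.java:81))"
        System.out.println(batchName("FlatMap", "FlatMap", "main(WordCount.java:81)"));
        // prints "CHAIN Map -> Filter"
        System.out.println(chainName(true, Arrays.asList("Map", "Filter")));
        // prints "Map -> Filter"
        System.out.println(chainName(false, Arrays.asList("Map", "Filter")));
    }
}
```

Routing every default name through one such formatter would be one possible way to keep sources, sinks and chains consistent, but that is only an assumption about how a fix might look.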



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)