You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by mxm <gi...@git.apache.org> on 2015/02/06 17:00:41 UTC

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

GitHub user mxm opened a pull request:

    https://github.com/apache/flink/pull/372

    [FLINK-1486] add print method for prefixing a user defined string

    - extend API to include a `print(String message)` method
    - change `PrintingOutputformat` to include a message

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/flink flink-1486

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/372.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #372
    
----
commit 62c7520d821f8a0718fe7fc3daca6fea09546c2b
Author: Max <ma...@posteo.de>
Date:   2015-02-06T15:55:29Z

    [FLINK-1486] add print method for prefixing a user defined string
    
    - extend API to include a print(String message) method
    - change PrintingOutputformat to include a message

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73270614
  
    Looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73504724
  
    To make it a bit more explicit what is sink identifier and what is the task identifier (especially when just one of the two are printed), I prefixed the sink identifier with "sinkId" and the task identifier with taskId.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95580149
  
    Can you think of a case where printing on the client is worse than printing on worker?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-94502825
  
    I've updated the pull request. I decided to implement the concise method:
    ```
    sinkId:taskId> output  <- sink id provided, parallelism > 1
    sinkId> output         <- sink id provided, parallelism == 1
    taskId> output         <- no sink id provided, parallelism > 1
    output                 <- no sink id provided, parallelism == 1
    ```
    If no objections, I will merge this tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-94703910
  
    I've added documentation for the new print method. Will merge later on.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95583993
  
    @StephanEwen Don't think printing on the client can be worse if the output still contains information about the producers (e.g. by a task id). IMO, a sink identifier could still make sense when you make multiple calls to print and want to distinguish easily between the outputs. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73410628
  
    Good idea. So we would print `$taskId > $outputValue` if the user did not supply a string and `$string:$taskId > $outputValue` otherwise. If the parallelization degree is 1, we would just print `$string > $outputValue` if a string was supplied.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73853743
  
    I think this is very valuable. I've tried it out and it looks good to me.
    
    Personally, I would prefer the shorter versions proposed by Fabian. If we don't differentiate between parallelism 1 and > 1 we wouldn't have to worry about cases where just one is printed.
    
    But I'm fine with your current solution as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73879164
  
    +1 for conciseness. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95869290
  
    Okay, let's just try and not make this too confusing for users. Do we need all three versions?
    
    1. `print()`
    2. `printWithParallelId()`
    3. `printWithPrefix()`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95151943
  
    This may come a bit late (given that this is merged now), but I did not think of it before:
    
    When we change the printing to happen on the client console (which we should, IMHO), we will probably realize it via `collect().foreach(print)` or so. In that case, there it gets executed immediately and there is always only one sink active.
    
    Does this change still make sense then?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95586431
  
    Ah, okay. You mean we have two methods:
    
    1.  `print()` which just prints everything onto the client
    2.  `printWithTaskid()`, which prefixes lines with the task ID?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73387496
  
    I think it would be nice to have some kind of hierarchical structure of the output such as:
    `$sinkName:$taskId > $outputValue`
    That would give the name of the sink first followed by the sink's task id, and finally behind the `>`prompt the actual output. Wouldn't that also make output parsing easier?
    
    Looks good otherwise.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95880258
  
    Yes, it should be simple for the user. It makes sense to have one print method which just prints the output on the client. In addition, we could have another *advanced* print method which prints a prefix and optionally the task id.
    
    - `print()`
    - `print(String prefix, boolean includeParallelID)`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/372


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-73417227
  
    Sounds good to me!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95580456
  
    From what I have seen, most people expect print() to actually go to the client.
    It would be a break of backwards compatibility, agreed. Maybe one of the few that are necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95488183
  
    Do we want to break backwards compatibility or include a new method for printing on the client? After all, printing on the workers is a useful tool to debug the dataflow of a program.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95153627
  
    I think you are right. If there's only one sink active, there is no need for a sink identifier.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-95855310
  
    Just saying that a prefix helps to identify output, even if everything is printed on the client. Additionally, including the task id in the output can be useful for debugging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-1486] add print method for prefixing a ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/372#issuecomment-94590741
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---