Posted to dev@zeppelin.apache.org by AlexanderShoshin <gi...@git.apache.org> on 2016/12/14 08:11:15 UTC

[GitHub] zeppelin pull request #1758: [ZEPPELIN-1787] Add an example of Flink Noteboo...

GitHub user AlexanderShoshin opened a pull request:

    https://github.com/apache/zeppelin/pull/1758

    [ZEPPELIN-1787] Add an example of Flink Notebook

    ### What is this PR for?
    This PR adds an example of batch processing with Flink to the Zeppelin tutorial notebooks. There are currently no Flink notebooks in the tutorial.
    
    ### What type of PR is it?
    Improvement
    
    ### What is the Jira issue?
    [ZEPPELIN-1787](https://issues.apache.org/jira/browse/ZEPPELIN-1787)
    
    ### How should this be tested?
    Open the `Using Flink for batch processing` notebook from the `Zeppelin Tutorial` folder and run all paragraphs one by one.
    
    ### Questions:
    * Do the license files need an update? - **no**
    * Are there breaking changes for older versions? - **no**
    * Does this need documentation? - **no**
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AlexanderShoshin/zeppelin ZEPPELIN-1787

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/1758.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1758
    
----
commit f64b60a15e2711746745df6cc78690b2fdcbe6e2
Author: Alexander Shoshin <al...@epam.com>
Date:   2016-12-13T12:52:20Z

    [ZEPPELIN-1787] Add an example of Flink Notebook

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AhyoungRyu, thank you for your notes!
    * I've changed `wget` to `curl` as you suggested.
    * As for the "permission denied" error message, I think it appears because the `%sh` paragraphs with the data download instructions did not finish correctly. Each paragraph downloads about 70 MB of data and then unpacks it, so they might have been stopped by the timeout. In that case you don't have the `/tmp/flights98.csv` file at all. I added a recommendation to the notebook to increase the `shell.command.timeout.millisecs` setting. It helped in my case.
    * We can't store data sets in the "home" folder because `%sh` and `%flink` may be run by different users, so they may have different home folders. The `/tmp/` folder is a common folder that can normally be accessed by every user. For the case of limited access I've added a `chmod 666 /tmp/flights<YY>.csv` command for each *csv* set. Maybe there is a better solution for this issue. I will be glad to receive any suggestions :)
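
The download-and-share step described in this comment can be sketched as two small shell functions. This is only a minimal sketch under stated assumptions: `fetch_dataset` and `share_dataset` are hypothetical helper names, and the URL in the usage comment is a placeholder, not the address the notebook downloads from. The `chmod 666` mode and the `/tmp/flights98.csv` path are the ones mentioned in the thread.

```shell
# Hypothetical helpers; the names are illustrative and not taken from the notebook.

# Download one file with curl instead of wget.
# -f fails on HTTP errors, -sS stays quiet but still prints errors,
# -L follows redirects.
fetch_dataset() {
    url="$1"
    dest="$2"
    curl -fsSL -o "$dest" "$url"
}

# Give every user read/write access so that %sh and %flink paragraphs
# running under different accounts can both open the file.
share_dataset() {
    chmod 666 "$1"
}

# Usage (placeholder URL, assumed for illustration only):
#   fetch_dataset "https://example.org/flights98.csv" /tmp/flights98.csv &&
#       share_dataset /tmp/flights98.csv
```

If the download runs longer than the shell interpreter's timeout, the file may never appear, which is why raising `shell.command.timeout.millisecs` was recommended.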




[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by tae-jun <gi...@git.apache.org>.
Github user tae-jun commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    [Link for the note on ZeppelinHub](https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL0FsZXhhbmRlclNob3NoaW4vemVwcGVsaW4tZXhhbXBsZS9tYXN0ZXIvbm90ZWJvb2tzL0ZsaWdodHMuanNvbg)
    
    For convenience :)
    
    @AlexanderShoshin Is this link right?



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Thanks for your quick update! 
    Since #1780 is trying to update all tutorial notes to the latest format, keeping this note in the older format isn't the best solution, I think. I'm not sure, but as you said, I assume there is another problem with the new note JSON format itself. (If so, we need to fix it, but in another PR :D) #1780 has the same problem, as I noted in [a comment there](https://github.com/apache/zeppelin/pull/1780#issuecomment-267756964).



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AhyoungRyu, thanks for the note. I've removed the URL to the notebook.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Thank you for such a precise explanation. I tested again and it works well!
    BTW, some markdown paragraphs are not shown, as in the screenshot below, so I didn't notice that they contain descriptions (I had to click "show editor" to see what they contain).
    
    ![screen shot 2016-12-16 at 2 43 53 pm](https://cloud.githubusercontent.com/assets/10060731/21253128/6ea08af2-c39f-11e6-94ea-155fbb7f894e.png)
    
    It would be better to show the results of the other description paragraphs by default, as the first paragraph does :)



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by rawkintrevo <gi...@git.apache.org>.
Github user rawkintrevo commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin I think what @zjffdu and @bzz meant was that you could call out this notebook somewhere like [`docs/interpreter/flink.md`](https://github.com/apache/zeppelin/blob/master/docs/interpreter/flink.md)



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    I've converted the notebook to the 0.6.2 format, so it should be displayed correctly now.
    The [ZeppelinHub view](https://www.zeppelinhub.com/viewer/notebooks/aHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL0FsZXhhbmRlclNob3NoaW4vemVwcGVsaW4vWkVQUEVMSU4tMTc4Ny9ub3RlYm9vay8yQzM1WVU4MTQvbm90ZS5qc29u) also works.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by bzz <gi...@git.apache.org>.
Github user bzz commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Is there any other feedback, or has everything been addressed so that we can merge it now?
    
    Saw @zjffdu 's
    >  How about referring to this note in flink.md?
    
    @AlexanderShoshin do you think it's worth updating the interpreter documentation under `./docs` as well, to mention this example?
    
    Other than that, looks great to me! 



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin Checked again and it looks nice. Except for the minor suggestion, LGTM. Thanks for your effort and the awesome tutorial note!




[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin It seems the two things below need to be done before merging:
    1. Revert the last commit [9013620](https://github.com/apache/zeppelin/pull/1758/commits/9013620e1c835c66a849467e1ea15d43fa2a7782)
    
    You can do this locally as follows:
    ```
    # Make sure you are on branch ZEPPELIN-1787 first
    $ git checkout ZEPPELIN-1787
    
    # Check whether the last commit is "convert notebook to 0.6.2 format"
    $ git log
    
    # If so, drop that commit
    $ git reset --hard HEAD^
    
    # Check again that the reset worked; HEAD should now point to the commit
    # "add download instruction, change "wget" to "curl""
    $ git log
    
    # Force-push the updated branch
    $ git push origin -f ZEPPELIN-1787
    ```
    
    2. Add the Flink docs link to the tutorial, as @rawkintrevo said.
    
    Please feel free to ping me if you need any help! 



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    I've added a new commit to convert the notebook to the 0.7.0 format because I found that commit [9013620](https://github.com/apache/zeppelin/pull/1758/commits/9013620e1c835c66a849467e1ea15d43fa2a7782) has several conflicts in the JSON file.
    I've also merged master to avoid conflicts in `flink.md`.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin Thanks for your contribution! 
    While quickly looking through this example note in my Zeppelin, I noticed a couple of things.
     - AFAIK, OSX doesn't have `wget` by default, so OSX users might need to install it themselves. I would suggest using `curl` instead of `wget` to download the datasets. (`curl` is a built-in command, I guess.)
    
     - I saw you set the dataset location to `/tmp/`. This can cause something like a "permission denied" error for that directory in Zeppelin, as shown below.
    ```
    Caused by: org.apache.flink.runtime.JobException: 
    Creating the input splits caused an error: 
    File /tmp/flights98.csv does not exist or the user running Flink ('ahyoungryu') has insufficient permissions to access it.
    ```
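
The Flink error above covers two different situations: the file is missing, or it exists but cannot be read. A quick check from a `%sh` paragraph can tell them apart. This is a minimal sketch; `check_input_file` is a hypothetical helper name, and the result only reflects the user running the shell, which may differ from the user running Flink.

```shell
# Hypothetical helper: report why an input file might be unusable.
check_input_file() {
    if [ ! -e "$1" ]; then
        # The download never produced the file.
        echo "missing"
    elif [ ! -r "$1" ]; then
        # The file is there, but the current user cannot read it.
        echo "unreadable"
    else
        echo "ok"
    fi
}

# Usage:
#   check_input_file /tmp/flights98.csv
```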




[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @zjffdu, should we do this in the current PR or open a new issue for the documentation improvement?



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin Thanks! Will merge if there are no more comments on this.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    I am a bit confused :)
    @AhyoungRyu, do I need to drop my last commit or not?



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Yes, but ZeppelinHub does not show all the paragraphs. I will try to find out the reason.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Sorry, I was not able to work on this issue in recent weeks.
    @AhyoungRyu, I will make the corrections soon.
    Thanks



[GitHub] zeppelin pull request #1758: [ZEPPELIN-1787] Add an example of Flink Noteboo...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/zeppelin/pull/1758



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin Right, that would be better. So sorry about that ㅠ_ㅠ



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    Oh, it should not look like this. It might be another problem with the new notebook JSON structure. I will correct this.



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @tae-jun, I found that **ZeppelinHub** can't display new notebooks correctly. This is because the note.json structure changed when [ZEPPELIN-212](https://issues.apache.org/jira/browse/ZEPPELIN-212) was merged (two weeks ago). The notebook now has a `results` attribute (instead of `result`) to store paragraph results, and it seems that **ZeppelinHub** can't read it.
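
Which of the two attribute names a given `note.json` uses can be checked with a small shell function. This is a rough sketch under stated assumptions: `note_result_attr` is a hypothetical helper, and it does a plain text match on the key names `results` and `result` mentioned in this comment instead of parsing the JSON, so treat it as a quick heuristic only.

```shell
# Hypothetical helper: print which result attribute a note.json file uses.
note_result_attr() {
    if grep -q '"results"[[:space:]]*:' "$1"; then
        # Newer structure, as described in the comment.
        echo "results"
    elif grep -q '"result"[[:space:]]*:' "$1"; then
        # Older structure.
        echo "result"
    else
        echo "none"
    fi
}

# Usage (the path is illustrative):
#   note_result_attr path/to/note.json
```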



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AlexanderShoshin <gi...@git.apache.org>.
Github user AlexanderShoshin commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @bzz, I am not sure. We already have a *Word Count* example there to describe `Flink` usage. Where should we place the link to this example?



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    How about referring to this note in Flink.md?



[GitHub] zeppelin issue #1758: [ZEPPELIN-1787] Add an example of Flink Notebook

Posted by AhyoungRyu <gi...@git.apache.org>.
Github user AhyoungRyu commented on the issue:

    https://github.com/apache/zeppelin/pull/1758
  
    @AlexanderShoshin Actually, it was my bad. I checked again and [this commit](https://github.com/apache/zeppelin/pull/1758/commits/fe2a39ec38ad2aed4b65e37c96e7b1dfd3f3489b) works perfectly. Please ignore my last comment, and sorry for the confusion.

