Posted to dev@zeppelin.apache.org by "Ahyoung (JIRA)" <ji...@apache.org> on 2015/11/23 04:01:10 UTC

[jira] [Created] (ZEPPELIN-457) Add documentation about Spark on EMR using Zeppelin Sandbox

Ahyoung created ZEPPELIN-457:
--------------------------------

             Summary: Add documentation about Spark on EMR using Zeppelin Sandbox
                 Key: ZEPPELIN-457
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-457
             Project: Zeppelin
          Issue Type: Improvement
            Reporter: Ahyoung
            Assignee: Ahyoung
            Priority: Minor


Many people now run Spark on AWS EMR clusters.
It would therefore help users if Zeppelin provided a step-by-step guide.

The documentation could cover the following:
 - How to create a cluster and install "Zeppelin-Sandbox".
 - How to establish a connection to the master node using SSH.
 - How to browse the web interfaces hosted on the cluster (setting up an SSH tunnel to the master node using local or dynamic port forwarding).
 - Information about the predefined Zeppelin-Sandbox environment variables (such as the locations of Zeppelin itself and its log and notebook directories on the master node), and the Hadoop, Spark, and Zeppelin service port numbers.
 - Tutorials for beginners, like the attached images.
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/SparkDataframe.png?raw=true!
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/Result.png?raw=true!
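
As a rough illustration of the SSH tunnel item above, here is a minimal Python sketch that only builds the ssh command lines for local and dynamic port forwarding. The helper names are hypothetical, and the key path, master node DNS name, port numbers (8890 and 8157), and the "hadoop" login user are placeholder assumptions rather than values confirmed in this issue.

```python
import shlex


def local_forward_cmd(key_path, master_dns, local_port, remote_port, user="hadoop"):
    """Build an ssh command that forwards one local port to a port on the master node.

    -N: do not run a remote command, only forward.
    -L: local port forwarding (local_port -> localhost:remote_port on the master).
    The default user "hadoop" is an assumption about the cluster's login account.
    """
    return [
        "ssh", "-i", key_path, "-N",
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{master_dns}",
    ]


def dynamic_forward_cmd(key_path, master_dns, socks_port, user="hadoop"):
    """Build an ssh command that opens a local SOCKS proxy through the master node.

    -D: dynamic port forwarding; a browser pointed at localhost:socks_port
    can then reach any web interface hosted on the cluster.
    """
    return [
        "ssh", "-i", key_path, "-N",
        "-D", str(socks_port),
        f"{user}@{master_dns}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace with the real key file and master public DNS.
    key = "~/mykey.pem"
    dns = "master-public-dns-name"
    # 8890 is assumed here as the Zeppelin port; check the cluster's actual setting.
    print(shlex.join(local_forward_cmd(key, dns, 8890, 8890)))
    print(shlex.join(dynamic_forward_cmd(key, dns, 8157)))
```

Local forwarding exposes a single service per tunnel, while dynamic forwarding needs a browser SOCKS proxy setting but covers every web interface on the cluster at once. The guide could explain both and let readers choose.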

Any ideas are welcome!




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)