Posted to dev@zeppelin.apache.org by "Ahyoung (JIRA)" <ji...@apache.org> on 2015/11/23 04:01:10 UTC
[jira] [Created] (ZEPPELIN-457) Add documentation about Spark on
EMR using Zeppelin Sandbox
Ahyoung created ZEPPELIN-457:
--------------------------------
Summary: Add documentation about Spark on EMR using Zeppelin Sandbox
Key: ZEPPELIN-457
URL: https://issues.apache.org/jira/browse/ZEPPELIN-457
Project: Zeppelin
Issue Type: Improvement
Reporter: Ahyoung
Assignee: Ahyoung
Priority: Minor
Many people now run Spark on AWS EMR clusters.
It would therefore help users if Zeppelin provided a step-by-step guide.
The documentation could cover the following topics.
- How to create a cluster and install "Zeppelin-Sandbox".
- How to establish a connection to the master node using SSH.
- How to browse the web interfaces hosted on the cluster (setting up an SSH tunnel to the master node using local or dynamic port forwarding).
- Information about the predefined Zeppelin-Sandbox environment variables (such as the locations of Zeppelin itself and of its log and notebook directories on the master node), and the service port numbers for Hadoop, Spark, Zeppelin, and so on.
- Tutorials for beginners, like the attached images.
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/SparkDataframe.png?raw=true!
!https://github.com/AhyoungRyu/Platform-Documentation/blob/master/img/Result.png?raw=true!
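As a rough illustration of the SSH tunnel item above, the guide could show how the two forwarding modes differ. The sketch below only builds the `ssh` command lines; it does not run them. The key file name, the master node host name, the `hadoop` login user, and the port numbers are all placeholder assumptions, not values taken from this issue, and would need to be replaced with the ones for a real cluster.

```python
import shlex


def local_forward_cmd(key_path, master_dns, local_port, remote_port, user="hadoop"):
    """Build an ssh command that forwards one local port to one port on the master node.

    -N : do not run a remote command, only forward ports
    -L : local port forwarding (local_port -> localhost:remote_port on the master)
    The default user "hadoop" is an assumption for illustration.
    """
    return [
        "ssh", "-i", key_path, "-N",
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{master_dns}",
    ]


def dynamic_forward_cmd(key_path, master_dns, socks_port, user="hadoop"):
    """Build an ssh command that opens a local SOCKS proxy through the master node.

    -D : dynamic port forwarding; a browser pointed at this SOCKS port
         can reach any web interface visible from the master node.
    """
    return [
        "ssh", "-i", key_path, "-N",
        "-D", str(socks_port),
        f"{user}@{master_dns}",
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    key = "my-key.pem"
    master = "master-node.example.com"
    print(shlex.join(local_forward_cmd(key, master, 8890, 8890)))
    print(shlex.join(dynamic_forward_cmd(key, master, 8157)))
```

Local forwarding is the simpler choice when only one web interface is needed; dynamic forwarding suits browsing several interfaces at once, at the cost of configuring a SOCKS proxy in the browser.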
Any ideas are welcome!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)