Posted to commits@zeppelin.apache.org by mo...@apache.org on 2017/12/14 16:46:25 UTC

svn commit: r1818166 [4/6] - in /zeppelin/site/docs/0.8.0-SNAPSHOT: ./ assets/themes/zeppelin/img/ assets/themes/zeppelin/img/docs-img/ assets/themes/zeppelin/img/screenshots/ development/ development/contribution/ development/helium/ interpreter/ quic...

Modified: zeppelin/site/docs/0.8.0-SNAPSHOT/search_data.json
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.8.0-SNAPSHOT/search_data.json?rev=1818166&r1=1818165&r2=1818166&view=diff
==============================================================================
--- zeppelin/site/docs/0.8.0-SNAPSHOT/search_data.json (original)
+++ zeppelin/site/docs/0.8.0-SNAPSHOT/search_data.json Thu Dec 14 16:46:23 2017
@@ -1,169 +1,144 @@
 {
   
-  
-  
-
-    "/development/contribution/how_to_contribute_code.html": {
-      "title": "Contributing to Apache Zeppelin (Code)",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Contributing to Apache Zeppelin ( Code )NOTE : Apache Zeppelin is an Apache2 License Software.Any contributions to Zeppelin (Source code, Documents, Image, Website) means you agree with license all your contributions as Apache2 License.Setting upHere are some tools you will need to build and test Zeppelin.Software Configuration Management ( SCM )Since Zeppelin uses Git for it's SCM system, you need git client installed in y
 our development machine.Integrated Development Environment ( IDE )You are free to use whatever IDE you prefer, or your favorite command line editor.Build ToolsTo build the code, installOracle Java 7Apache MavenGetting the source codeFirst of all, you need Zeppelin source code. The official location of Zeppelin is http://git.apache.org/zeppelin.git.git accessGet the source code on your development machine using git.git clone git://git.apache.org/zeppelin.git zeppelinYou may also want to develop against a specific branch. For example, for branch-0.5.6git clone -b branch-0.5.6 git://git.apache.org/zeppelin.git zeppelinApache Zeppelin follows Fork & Pull as a source control workflow.If you want to not only build Zeppelin but also make any changes, then you need to fork Zeppelin github mirror repository and make a pull request.Before making a pull request, please take a look Contribution Guidelines.Buildmvn installTo skip testmvn install -DskipTestsTo build with specific spark / 
 hadoop versionmvn install -Dspark.version=x.x.x -Dhadoop.version=x.x.xFor the further Run Zeppelin server in development modecd zeppelin-serverHADOOP_HOME=YOUR_HADOOP_HOME JAVA_HOME=YOUR_JAVA_HOME mvn exec:java -Dexec.mainClass="org.apache.zeppelin.server.ZeppelinServer" -Dexec.args=""Note: Make sure you first run mvn clean install -DskipTests on your zeppelin root directory, otherwise your server build will fail to find the required dependencies in the local repro.or use daemon scriptbin/zeppelin-daemon startServer will be run on http://localhost:8080.Generating Thrift CodeSome portions of the Zeppelin code are generated by Thrift. For most Zeppelin changes, you don't need to worry about this. But if you modify any of the Thrift IDL files (e.g. zeppelin-interpreter/src/main/thrift/*.thrift), then you also need to regenerate these files and submit their updated version as part of your patch.To regenerate the code, install thrift-0.9.2 and 
 then run the following command to generate thrift code.cd <zeppelin_home>/zeppelin-interpreter/src/main/thrift./genthrift.shRun Selenium testZeppelin has set of integration tests using Selenium. To run these test, first build and run Zeppelin and make sure Zeppelin is running on port 8080. Then you can run test using following commandTEST_SELENIUM=true mvn test -Dtest=[TEST_NAME] -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'For example, to run ParagraphActionIT,TEST_SELENIUM=true mvn test -Dtest=ParagraphActionsIT -DfailIfNoTests=false -pl 'zeppelin-interpreter,zeppelin-zengine,zeppelin-server'You'll need Firefox web browser installed in your development environment. While CI server uses Firefox 31.0 to run selenium test, it is good idea to install the same version (disable auto update to keep the version).Where to StartYou can find issues for beginner & newbieStay involvedContributors 
 should join the Zeppelin mailing lists.dev@zeppelin.apache.org is for people who want to contribute code to Zeppelin. subscribe, unsubscribe, archivesIf you have any issues, create a ticket in JIRA.",
-      "url": " /development/contribution/how_to_contribute_code.html",
-      "group": "development/contribution",
-      "excerpt": "How can you contribute to Apache Zeppelin project? This document covers from setting up your develop environment to making a pull request on Github."
-    }
-    ,
-    
-  
-
-    "/development/contribution/how_to_contribute_website.html": {
-      "title": "Contributing to Apache Zeppelin (Website)",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Contributing to Apache Zeppelin ( Website )This page will give you an overview of how to build and contribute to the documentation of Apache Zeppelin.The online documentation at zeppelin.apache.org is also generated from the files found here.NOTE : Apache Zeppelin is an Apache2 License Software.Any contributions to Zeppelin (Source code, Documents, Image, Website) means you agree with license all your contributions as Apache2 Licen
 se.Getting the source codeFirst of all, you need Zeppelin source code. The official location of Zeppelin is http://git.apache.org/zeppelin.git.Documentation website is hosted in 'master' branch under /docs/ dir.git accessFirst of all, you need the website source code. The official location of mirror for Zeppelin is http://git.apache.org/zeppelin.git.Get the source code on your development machine using git.git clone git://git.apache.org/zeppelin.gitcd docsApache Zeppelin follows Fork & Pull as a source control workflow.If you want to not only build Zeppelin but also make any changes, then you need to fork Zeppelin github mirror repository and make a pull request.BuildYou'll need to install some prerequisites to build the code. Please check Build documentation section in docs/README.md.Run website in development modeWhile you're modifying website, you might want to see preview of it. Please check Run website section in docs/README.md.Then you&a
 mp;#39;ll be able to access it on http://localhost:4000 with your web browser.Making a Pull RequestWhen you are ready, just make a pull-request.Alternative wayYou can directly edit .md files in /docs/ directory at the web interface of github and make pull-request immediately.Stay involvedContributors should join the Zeppelin mailing lists.dev@zeppelin.apache.org is for people who want to contribute code to Zeppelin. subscribe, unsubscribe, archivesIf you have any issues, create a ticket in JIRA.",
-      "url": " /development/contribution/how_to_contribute_website.html",
-      "group": "development/contribution",
-      "excerpt": "How can you contribute to Apache Zeppelin project website? This document covers from building Zeppelin documentation site to making a pull request on Github."
-    }
-    ,
-    
-  
 
-    "/development/contribution/useful_developer_tools.html": {
-      "title": "Useful Developer Tools",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Useful Developer ToolsDeveloping zeppelin-webCheck zeppelin-web: Local Development.ToolsSVM: Scala Version Managersvm would be useful when changing scala version frequently.JDK change script: OSXthis script would be helpful when changing JDK version frequently.function setjdk() {  if [ $# -ne 0 ]; then  # written based on OSX.   # use diffrent base path for other OS  removeFromPath '/System/Library/Frameworks/JavaVM.framewo
 rk/Home/bin'  if [ -n "${JAVA_HOME+x}" ]; then    removeFromPath $JAVA_HOME  fi  export JAVA_HOME=`/usr/libexec/java_home -v $@`  export PATH=$JAVA_HOME/bin:$PATH  fi}function removeFromPath() {  export PATH=$(echo $PATH | sed -E -e "s;:$1;;" -e "s;$1:?;;")}you can use this function like setjdk 1.8 / setjdk 1.7Building Submodules Selectively# build `zeppelin-web` onlymvn clean -pl 'zeppelin-web' package -DskipTests;# build `zeppelin-server` and its dependencies onlymvn clean package -pl 'spark,spark-dependencies,python,markdown,zeppelin-server' --am -DskipTests# build spark related modules with default profiles: scala 2.10 mvn clean package -pl 'spark,spark-dependencies,zeppelin-server' --am -DskipTests# build spark related modules with profiles: scala 2.11, spark 2.1 hadoop 2.7 ./dev/change_scala_version.sh 2.11mvn clean package -Pspark-2.1 -Phadoop-2.7 -Pscala-2.11 -pl &am
 p;#39;spark,spark-dependencies,zeppelin-server' --am -DskipTests# build `zeppelin-server` and `markdown` with dependenciesmvn clean package -pl 'markdown,zeppelin-server' --am -DskipTestsRunning Individual Tests# run the `HeliumBundleFactoryTest` test classmvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=HeliumBundleFactoryTestRunning Selenium TestsMake sure that Zeppelin instance is started to execute integration tests (= selenium tests).# run the `SparkParagraphIT` test classTEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT# run the `testSqlSpark` test function only in the `SparkParagraphIT` class# but note that, some test might be dependent on the previous testsTEST_SELENIUM="true" mvn test -pl 'zeppelin-server' --am -DfailIfNoTests=false -Dtest=SparkParagraphIT#testSqlSpark",
-      "url": " /development/contribution/useful_developer_tools.html",
-      "group": "development/contribution",
-      "excerpt": ""
+    "/interpreter/livy.html": {
+      "title": "Livy Interpreter for Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Livy Interpreter for Apache ZeppelinOverviewLivy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN.Interactive Scala, Python and R shellsBatch submissions in Scala, Java, PythonMulti users can share the same server (impersonation support)Can be used for submitting jobs from anywhere with RESTDoes not require a
 ny code change to your programsRequirementsAdditional requirements for the Livy interpreter are:Spark 1.3 or above.Livy server.ConfigurationWe added some common configurations for spark, and you can set any configuration you want.You can find all Spark configurations in here.And instead of starting property with spark. it should be replaced with livy.spark..Example: spark.driver.memory to livy.spark.driver.memory      Property    Default    Description        zeppelin.livy.url    http://localhost:8998    URL where livy server is running        zeppelin.livy.spark.sql.maxResult    1000    Max number of Spark SQL result to display.        zeppelin.livy.spark.sql.field.truncate    true    Whether to truncate field values longer than 20 characters or not        zeppelin.livy.session.create_timeout    120    Timeout in seconds for session creation        zeppelin.livy.displayAppInfo    true    Whether to display app info        zeppelin.livy.pull_status.interval.millis    1000    The int
 erval for checking paragraph execution status        livy.spark.driver.cores        Driver cores. ex) 1, 2.          livy.spark.driver.memory        Driver memory. ex) 512m, 32g.          livy.spark.executor.instances        Executor instances. ex) 1, 4.          livy.spark.executor.cores        Num cores per executor. ex) 1, 4.        livy.spark.executor.memory        Executor memory per worker instance. ex) 512m, 32g.        livy.spark.dynamicAllocation.enabled        Use dynamic resource allocation. ex) True, False.        livy.spark.dynamicAllocation.cachedExecutorIdleTimeout        Remove an executor which has cached data blocks.        livy.spark.dynamicAllocation.minExecutors        Lower bound for the number of executors.        livy.spark.dynamicAllocation.initialExecutors        Initial number of executors to run.        livy.spark.dynamicAllocation.maxExecutors        Upper bound for the number of executors.            livy.spark.jars.packages            Adding extra libr
 aries to livy interpreter          zeppelin.livy.ssl.trustStore        client trustStore file. Used when livy ssl is enabled        zeppelin.livy.ssl.trustStorePassword        password for trustStore file. Used when livy ssl is enabled        zeppelin.livy.http.headers    key_1: value_1; key_2: value_2    custom http headers when calling livy rest api. Each http header is separated by `;`, and each header is one key value pair where key value is separated by `:`  We removed livy.spark.master in zeppelin-0.7, because we suggest using livy 0.3 with zeppelin-0.7, and livy 0.3 doesn't allow specifying livy.spark.master; it enforces yarn-cluster mode.Adding External librariesYou can load dynamic libraries into the livy interpreter by setting the livy.spark.jars.packages property to a comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. The format for the coordinates should be groupId:artifactId:version.Example      Property    Example    Description
          livy.spark.jars.packages      io.spray:spray-json_2.10:1.3.1      Adding extra libraries to livy interpreter      How to useBasically, you can usespark%livy.sparksc.versionpyspark%livy.pysparkprint "1"sparkR%livy.sparkrhello <- function( name ) {    sprintf( "Hello, %s", name );}hello("livy")ImpersonationWhen Zeppelin server is running with authentication enabled,then this interpreter utilizes Livy’s user impersonation featurei.e. sends an extra parameter for creating and running a session ("proxyUser": "${loggedInUser}").This is particularly useful when multiple users are sharing a Notebook server.Apply Zeppelin Dynamic FormsYou can leverage Zeppelin Dynamic Form. Form templates are only available for the livy sql interpreter.%livy.sqlselect * from products where ${product_id=1}And creating dynamic forms programmatically is not feasible in the livy interpreter, because ZeppelinContext i
 s not available in livy interpreter.FAQLivy debugging: If you see any of these in error consoleConnect to livyhost:8998 [livyhost/127.0.0.1, livyhost/0:0:0:0:0:0:0:1] failed: Connection refusedLooks like the livy server is not up yet or the config is wrongException: Session not found, Livy server would have restarted, or lost session.The session would have timed out, you may need to restart the interpreter.Blacklisted configuration values in session config: spark.masterEdit conf/spark-blacklist.conf file in livy server and comment out #spark.master line.If you choose to work on livy in apps/spark/java directory in https://github.com/cloudera/hue,copy spark-user-configurable-options.template to spark-user-configurable-options.conf file in livy server and comment out #spark.master.",
+      "url": " /interpreter/livy.html",
+      "group": "interpreter",
+      "excerpt": "Livy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN."
     }
     ,
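A minimal sketch of the Livy usage summarized in the entry above, assuming a Livy server is reachable at the default zeppelin.livy.url (http://localhost:8998); sc is the SparkContext that Livy creates for the session, and the parallelize/sum calls are only illustrative:

%livy.spark
// runs inside the remote Livy session created for the logged-in user
sc.version
val nums = sc.parallelize(1 to 100)
nums.sum()

Per the naming rule described above, a standard Spark setting such as spark.driver.memory is entered in the interpreter configuration as livy.spark.driver.memory.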
     
   
 
-    "/development/helium/overview.html": {
-      "title": "Helium",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Helium OverviewWhat is Helium?Helium is a plugin system that can extend Zeppelin a lot. For example, you can write custom display system or install already published one in Heliun Online Registry. Currently, Helium supports 4 types of package.Helium Visualization: Adding a new chart typeHelium Spell: Adding new interpreter, display system running on browserHelium Application Helium Interpreter: Adding a new custom interpreter",
-      "url": " /development/helium/overview.html",
-      "group": "development/helium",
-      "excerpt": ""
+    "/interpreter/pig.html": {
+      "title": "Pig Interpreter for Apache Zeppelin",
+      "content"  : "Pig Interpreter for Apache ZeppelinOverviewApache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turns enables them to handle very large data sets.Supported interpreter type%pig.script (default Pig interpreter, so you can use %pig)%pig.script is like the Pig grunt shell. Anything you can run in Pig grunt shell can be run in %pig.script interpreter, it is used for running Pig script where you don’t need to visualize the data, it is suitable for data munging. %pig.query%pig.query is a little different compared with %pig.script. It is used for exploratory data analysis via Pig latin where you can leverage Zeppelin’s visualization ability. There're 2 minor differences in the last statement between %pig
 .script and %pig.queryNo pig alias in the last statement in %pig.query (read the examples below).The last statement must be in single line in %pig.queryHow to useHow to setup Pig execution modes.Local ModeSet zeppelin.pig.execType as local.MapReduce ModeSet zeppelin.pig.execType as mapreduce. HADOOP_CONF_DIR needs to be specified in ZEPPELIN_HOME/conf/zeppelin-env.sh.Tez Local ModeOnly Tez 0.7 is supported. Set zeppelin.pig.execType as tez_local.Tez ModeOnly Tez 0.7 is supported. Set zeppelin.pig.execType as tez. HADOOP_CONF_DIR and TEZ_CONF_DIR needs to be specified in ZEPPELIN_HOME/conf/zeppelin-env.sh.Spark Local ModeOnly Spark 1.6.x is supported, by default it is Spark 1.6.3. Set zeppelin.pig.execType as spark_local.Spark ModeOnly Spark 1.6.x is supported, by default it is Spark 1.6.3. Set zeppelin.pig.execType as spark. For now, only yarn-client mode is supported. To enable it, you need to set property SPARK_MASTER to yarn-client and set SPARK_JAR to the spark assembly jar.How 
 to choose custom Spark VersionBy default, Pig Interpreter would use Spark 1.6.3 built with scala 2.10, if you want to use another spark version or scala version, you need to rebuild Zeppelin by specifying the custom Spark version via -Dpig.spark.version= and scala version via -Dpig.scala.version= in the maven build command.How to configure interpreterAt the Interpreters menu, you have to create a new Pig interpreter. Pig interpreter has below properties by default.And you can set any Pig properties here which will be passed to Pig engine. (like tez.queue.name & mapred.job.queue.name).Besides, we use paragraph title as job name if it exists, else use the last line of Pig script. So you can use that to find app running in YARN RM UI.            Property        Default        Description                zeppelin.pig.execType        mapreduce        Execution mode for pig runtime. local | mapreduce | tez_local | tez | spark_local | spark                 zeppelin.pig.includeJobSta
 ts        false        whether display jobStats info in %pig.script                zeppelin.pig.maxResult        1000        max row number displayed in %pig.query                tez.queue.name        default        queue name for tez engine                mapred.job.queue.name        default        queue name for mapreduce engine                SPARK_MASTER        local        local | yarn-client                SPARK_JAR                The spark assembly jar, both jar in local or hdfs is supported. Put it on hdfs could have        performance benefit      Examplepig%pigbankText = load 'bank.csv' using PigStorage(';');bank = foreach bankText generate $0 as age, $1 as job, $2 as marital, $3 as education, $5 as balance; bank = filter bank by age != '"age"';bank = foreach bank generate (int)age, REPLACE(job,'"','') as job, REPLACE(marital, '"', '&a
 mp;#39;) as marital, (int)(REPLACE(balance, '"', '')) as balance;store bank into 'clean_bank.csv' using PigStorage(';'); -- this statement is optional, it just show you that most of time %pig.script is used for data munging before querying the data. pig.queryGet the number of each age where age is less than 30%pig.querybank_data = filter bank by age < 30;b = group bank_data by age;foreach b generate group, COUNT($1);The same as above, but use dynamic text form so that use can specify the variable maxAge in textbox. (See screenshot below). Dynamic form is a very cool feature of Zeppelin, you can refer this link) for details.%pig.querybank_data = filter bank by age < ${maxAge=40};b = group bank_data by age;foreach b generate group, COUNT($1) as count;Get the number of each age for specific marital type, also use dynamic form here. User can choose the marital type in the dropdown list (see screenshot
  below).%pig.querybank_data = filter bank by marital=='${marital=single,single|divorced|married}';b = group bank_data by age;foreach b generate group, COUNT($1) as count;The above examples are in the Pig tutorial note in Zeppelin, you can check that for details. Here's the screenshot.Data is shared between %pig and %pig.query, so that you can do some common work in %pig, and do different kinds of query based on the data of %pig. Besides, we recommend you to specify alias explicitly so that the visualization can display the column name correctly. In the above example 2 and 3 of %pig.query, we name COUNT($1) as count. If you don't do this, then we will name it using position. E.g. in the above first example of %pig.query, we will use col_1 in chart to represent COUNT($1).",
+      "url": " /interpreter/pig.html",
+      "group": "manual",
+      "excerpt": "Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs."
     }
     ,
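A small sketch of the %pig.script / %pig.query split described in the entry above, adapted from its bank.csv example; the file name and column layout come from that example, and the last %pig.query statement has no alias and stays on a single line:

%pig
bankText = load 'bank.csv' using PigStorage(';');
bank = foreach bankText generate $0 as age, $1 as job;
bank = filter bank by age != '"age"';          -- drop the CSV header row
bank = foreach bank generate (int)age, job;

%pig.query
bank_data = filter bank by age < 30;
b = group bank_data by age;
foreach b generate group, COUNT($1) as count;  -- no alias, single line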
     
   
 
-    "/development/helium/writing_application.html": {
-      "title": "Writing a new Helium Application",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Writing a new ApplicationWhat is Apache Zeppelin ApplicationApache Zeppelin Application is a package that runs on Interpreter process and displays it's output inside of the notebook. While application runs on Interpreter process, it's able to access resources provided by Interpreter through ResourcePool. Output is always rendered by AngularDisplaySystem. Therefore application provides all the possiblities of making 
 interactive graphical application that uses data and processing power of any Interpreter.Make your own ApplicationWriting Application means extending org.apache.zeppelin.helium.Application. You can use your favorite IDE and language while Java class files are packaged into jar. Application class looks like/** * Constructor. Invoked when application is loaded */public Application(ApplicationContext context);/** * Invoked when there're (possible) updates in required resource set. * i.e. invoked after application load and after paragraph finishes. */public abstract void run(ResourceSet args);/** * Invoked before application unload. * Application is automatically unloaded with paragraph/notebook removal */public abstract void unload();You can check example applications under ./zeppelin-examples directory.Development modeIn the development mode, you can run your Application in your IDE as a normal java application and see the result inside of Zeppelin notebook.org.apache.zeppelin
 .helium.ZeppelinApplicationDevServer can run Zeppelin Application in development mode.// entry point for development modepublic static void main(String[] args) throws Exception {  // add resources for development mode  LocalResourcePool pool = new LocalResourcePool("dev");  pool.put("date", new Date());  // run application in devlopment mode with given resource  // in this case, Clock.class.getName() will be the application class name    org.apache.zeppelin.helium.ZeppelinApplicationDevServer devServer = new org.apache.zeppelin.helium.ZeppelinApplicationDevServer(    Clock.class.getName(), pool.getAll());  // start development mode  devServer.start();  devServer.join();}In the Zeppelin notebook, run %dev run will connect to application running in development mode.Package filePackage file is a json file that provides information about the application.Json file contains the following information{  "name" : "[organization].
 [name]",  "description" : "Description",  "artifact" : "groupId:artifactId:version",  "className" : "your.package.name.YourApplicationClass",  "resources" : [    ["resource.name", ":resource.class.name"],    ["alternative.resource.name", ":alternative.class.name"]  ],  "icon" : "<i class='icon'></i>"}nameName is a string in [group].[name] format.[group] and [name] allow only [A-Za-z0-9_].Group is normally the name of an organization who creates this application.descriptionA short description about the applicationartifactLocation of the jar artifact."groupId:artifactId:version" will load artifact from maven repository.If jar exists in the local filesystem, absolute/relative can be use
 d.e.g.When artifact exists in Maven repositoryartifact: "org.apache.zeppelin:zeppelin-examples:0.6.0"When artifact exists in the local filesystemartifact: "zeppelin-example/target/zeppelin-example-0.6.0.jar"classNameEntry point. Class that extends org.apache.zeppelin.helium.ApplicationresourcesTwo dimensional array that defines required resources by name or by className. Helium Application launcher will compare resources in the ResourcePool with the information in this field and suggest application only when all required resources are available in the ResourcePool.Resouce name is a string which will be compared with the name of objects in the ResourcePool. className is a string with ":" prepended, which will be compared with className of the objects in the ResourcePool.Application may require two or more resources. Required resources can be listed inside of the json array. For example, if the application requires object &quot
 ;name1", "name2" and "className1" type of object to run, resources field can beresources: [  [ "name1", "name2", ":className1", ...]]If Application can handle alternative combination of required resources, alternative set can be listed as below.resources: [  [ "name", ":className"],  [ "altName", ":altClassName1"],  ...]Easier way to understand this scheme isresources: [   [ 'resource' AND 'resource' AND ... ] OR   [ 'resource' AND 'resource' AND ... ] OR   ...]iconIcon to be used on the application button. String in this field will be rendered as a HTML tag.e.g.icon: "<i class='fa fa-clock-o'></i>"",
-      "url": " /development/helium/writing_application.html",
-      "group": "development/helium",
-      "excerpt": "Apache Zeppelin Application is a package that runs on Interpreter process and displays it's output inside of the notebook. Make your own Application in Apache Zeppelin is quite easy."
+    "/interpreter/markdown.html": {
+      "title": "Markdown Interpreter for Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Markdown Interpreter for Apache ZeppelinOverviewMarkdown is a plain text formatting syntax designed so that it can be converted to HTML.Apache Zeppelin uses pegdown and markdown4j as markdown parsers.In Zeppelin notebook, you can use %md in the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.In Zeppelin, Markdown interpreter is enabled by default and uses the pegdown par
 ser.ExampleThe following example demonstrates the basic usage of Markdown in a Zeppelin notebook.Mathematical expressionMarkdown interpreter leverages %html display system internally. That means you can mix mathematical expressions with markdown syntax. For more information, please see Mathematical Expression section.Configuration      Name    Default Value    Description        markdown.parser.type    pegdown    Markdown Parser Type.  Available values: pegdown, markdown4j.  Pegdown Parserpegdown parser provides github flavored markdown.The pegdown parser also provides YUML and Websequence plugins. Markdown4j ParserSince the pegdown parser is more accurate and supports more markdown syntax, the markdown4j option might be removed later, but it is kept for backward compatibility.",
+      "url": " /interpreter/markdown.html",
+      "group": "interpreter",
+      "excerpt": "Markdown is a plain text formatting syntax designed so that it can be converted to HTML. Apache Zeppelin uses markdown4j."
     }
     ,
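A short sketch of the usage described in the entry above; the paragraph text is illustrative, and the $$...$$ math delimiters are an assumption based on the Mathematical Expression section referenced there (math rendering goes through the %html display system):

%md
### Report
Markdown here is converted to **static HTML** by the pegdown parser (the default).
Inline math such as $$ e^{i\pi} + 1 = 0 $$ is rendered by the %html display system.

Switching markdown.parser.type from pegdown to markdown4j in the interpreter setting only changes the parser; the %md usage stays the same.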
     
   
 
-    "/development/helium/writing_spell.html": {
-      "title": "Writing a new Helium Spell",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Writing a new SpellWhat is Apache Zeppelin SpellSpell is a kind of interpreter that runs on browser not on backend. So, technically it's the frontend interpreter.It can provide many benefits.Spell is pluggable frontend interpreter. So it can be installed and removed easily using helium registry.Every spell is written in javascript. It means you can use existing javascript libraries whatever you want.Spell runs on browser li
 ke display system (%html, %table). In other words, every spell can be used as display system as well.How it worksHelium Spell works like Helium Visualization.Every helium packages are loaded from central (online) registry or local registryYou can see loaded packages in /helium page.When you enable a spell, it's built from server and sent to clientFinally it will be loaded into browser.How to use spell1. EnablingFind a spell what you want to use in /helium package and click Enable button.2. UsingSpell works like an interpreter. Use the MAGIC value to execute spell in a note. (you might need to refresh after enabling)For example, Use %echo for the Echo Spell.Write a new SpellMaking a new spell is similar to Helium Visualization#write-new-visualization.Add framework dependency called zeppelin-spell into package.jsonWrite code using frameworkPublish your spell to npm1. Create a npm packageCreate a package.json in new directory for spell.You have to add a framework called zeppeli
 n-spell as a dependency to create spell (zeppelin-spell)Also, you can add any dependencies you want to utilise.Here's an example{  "name": "zeppelin-echo-spell",  "description": "Zeppelin Echo Spell (example)",  "version": "1.0.0",  "main": "index",  "author": "",  "license": "Apache-2.0",  "dependencies": {    "zeppelin-spell": "*"  },  "helium": {    "icon" : "<i class='fa fa-repeat'></i>",    "spell": {      "magic": "%echo",      "usage": "%echo <TEXT>"    }  }}2. Write spell using frameworkHere are some exa
 mples you can referEcho SpellMarkdown Spell: Using libraryFlowchart Spell: Using DOMGoogle Translation API Spell: Using API (returning promise)Now, you need to write code to create spell which processing text.import {    SpellBase,    SpellResult,    DefaultDisplayType,} from 'zeppelin-spell';export default class EchoSpell extends SpellBase {    constructor() {        /** pass magic to super class's constructor parameter */        super("%echo");    }    interpret(paragraphText) {        const processed = paragraphText + '!';        /**         * should return `SpellResult` which including `data` and `type`         * default type is `TEXT` if you don't specify.           */        return new SpellResult(processed);    }}Here is another example. Let's say we want to create markdown spell. First of all, we should add a dependency for markdown in package.json// package.json "dependencies": {    
 "markdown": "0.5.0",    "zeppelin-spell": "*"  },And here is spell code.import {    SpellBase,    SpellResult,    DefaultDisplayType,} from 'zeppelin-spell';import md from 'markdown';const markdown = md.markdown;export default class MarkdownSpell extends SpellBase {    constructor() {        super("%markdown");    }    interpret(paragraphText) {        const parsed = markdown.toHTML(paragraphText);        /**         * specify `DefaultDisplayType.HTML` since `parsed` will contain DOM         * otherwise it will be rendered as `DefaultDisplayType.TEXT` (default)         */        return new SpellResult(parsed, DefaultDisplayType.HTML);    }}You might want to manipulate DOM directly (e.g google d3.js), then refer Flowchart SpellYou might want to return promise not string (e.g API call), then refer Google Translation API Spell3. Create Helium package file for local 
 deploymentYou don't want to publish your package every time you make a change in your spell. Zeppelin provides local deploy.The only thing you need to do is creating a Helium Package file (JSON) for local deploy.It's automatically created when you publish to npm repository but in local case, you should make it by yourself.{  "type" : "SPELL",  "name" : "zeppelin-echo-spell",  "version": "1.0.0",  "description" : "Return just what receive (example)",  "artifact" : "./zeppelin-examples/zeppelin-example-spell-echo",  "license" : "Apache-2.0",  "spell": {    "magic": "%echo",    "usage": "%echo <TEXT>"  }}Place this file in your local registry directory (de
 fault $ZEPPELIN_HOME/helium).type should be SPELLMake sure that artifact should be same as your spell directory.You can get information about other fields in Helium Visualization#3-create-helium-package-file-and-locally-deploy.4. Run in dev modecd zeppelin-webyarn run dev:heliumYou can browse localhost:9000. Every time refresh your browser, Zeppelin will rebuild your spell and reload changes.5. Publish your spell to the npm repositorySee Publishing npm packages",
-      "url": " /development/helium/writing_spell.html",
-      "group": "development/helium",
-      "excerpt": "Spell is a kind of interpreter that runs on browser not on backend. So, technically it's the frontend interpreter. "
+    "/interpreter/mahout.html": {
+      "title": "Mahout Interpreter for Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Apache Mahout Interpreter for Apache ZeppelinInstallationApache Mahout is a collection of packages that enable machine learning and matrix algebra on underlying engines such as Apache Flink or Apache Spark.  A convenience script for creating and configuring two Mahout enabled interpreters exists.  The %sparkMahout and %flinkMahout interpreters do not exist by default but can be easily created using this script.  Easy InstallationTo
  quickly and easily get up and running using Apache Mahout, run the following command from the top-level directory of the Zeppelin install:bashpython scripts/mahout/add_mahout.pyThis will create the %sparkMahout and %flinkMahout interpreters, and restart Zeppelin.Advanced InstallationThe add_mahout.py script contains several command line arguments for advanced users.      Argument    Description    Example        --zeppelin_home    This is the path to the Zeppelin installation.  This flag is not needed if the script is run from the top-level installation directory or from the `zeppelin/scripts/mahout` directory.    /path/to/zeppelin        --mahout_home    If the user has already installed Mahout, this flag can set the path to `MAHOUT_HOME`.  If this is set, downloading Mahout will be skipped.    /path/to/mahout_home        --restart_later    Restarting is necessary for updates to take effect. By default the script will restart Zeppelin for you- restart will be skipped if this flag 
 is set.    NA        --force_download    This flag will force the script to re-download the binary even if it already exists.  This is useful for previously failed downloads.    NA          --overwrite_existing      This flag will force the script to overwrite existing `%sparkMahout` and `%flinkMahout` interpreters. Useful when you want to just start over.      NA    NOTE 1: Apache Mahout at this time only supports Spark 1.5 and Spark 1.6 and Scala 2.10.  If the user is using another version of Spark (e.g. 2.0), the %sparkMahout will likely not work.  The %flinkMahout interpreter will still work and the user is encouraged to develop with that engine as the code can be ported via copy and paste, as is evidenced by the tutorial notebook.NOTE 2: If using Apache Flink in cluster mode, the following libraries will also need to be coppied to ${FLINK_HOME}/lib- mahout-math-0.12.2.jar- mahout-math-scala2.10-0.12.2.jar- mahout-flink2.10-0.12.2.jar- mahout-hdfs-0.12.2.jar- com.google.guava:gu
 ava:14.0.1OverviewThe Apache Mahout™ project's goal is to build an environment for quickly creating scalable performant machine learning applications.Apache Mahout software provides three major features:A simple and extensible programming environment and framework for building scalable algorithmsA wide variety of premade algorithms for Scala + Apache Spark, H2O, Apache FlinkSamsara, a vector math experimentation environment with R-like syntax which works at scaleIn other words:Apache Mahout provides a unified API for quickly creating machine learning algorithms on a variety of engines.How to useWhen starting a session with Apache Mahout, depending on which engine you are using (Spark or Flink), a few imports must be made and a Distributed Context must be declared.  Copy and paste the following code and run once to get started.Flink%flinkMahoutimport org.apache.flink.api.scala._import org.apache.mahout.math.drm._import org.apache.mahout.math.drm.RLikeDrmOps._import org.a
 pache.mahout.flinkbindings._import org.apache.mahout.math._import scalabindings._import RLikeOps._implicit val ctx = new FlinkDistributedContext(benv)Spark%sparkMahoutimport org.apache.mahout.math._import org.apache.mahout.math.scalabindings._import org.apache.mahout.math.drm._import org.apache.mahout.math.scalabindings.RLikeOps._import org.apache.mahout.math.drm.RLikeDrmOps._import org.apache.mahout.sparkbindings._implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)Same Code, Different EnginesAfter importing and setting up the distributed context, the Mahout R-Like DSL is consistent across engines.  The following code will run in both %flinkMahout and %sparkMahoutval drmData = drmParallelize(dense(  (2, 2, 10.5, 10, 29.509541),  // Apple Cinnamon Cheerios  (1, 2, 12,   12, 18.042851),  // Cap'n'Crunch  (1, 1, 12,   13, 22.736446),  // Cocoa Puffs  (2, 1, 11,   13, 32.207582),  // Froot Loops  (1, 2, 12,   11, 21.871292),  // Honey G
 raham Ohs  (2, 1, 16,   8,  36.187559),  // Wheaties Honey Gold  (6, 2, 17,   1,  50.764999),  // Cheerios  (3, 2, 13,   7,  40.400208),  // Clusters  (3, 3, 13,   4,  45.811716)), numPartitions = 2)drmData.collect(::, 0 until 4)val drmX = drmData(::, 0 until 4)val y = drmData.collect(::, 4)val drmXtX = drmX.t %*% drmXval drmXty = drmX.t %*% yval XtX = drmXtX.collectval Xty = drmXty.collect(::, 0)val beta = solve(XtX, Xty)Leveraging Resource Pools and R for VisualizationResource Pools are a powerful Zeppelin feature that lets us share information between interpreters. A fun trick is to take the output of our work in Mahout and analyze it in other languages.Setting up a Resource Pool in FlinkIn Spark based interpreters resource pools are accessed via the ZeppelinContext API.  To put and get things from the resource pool one can be done simplescalaval myVal = 1z.put("foo", myVal)val myFetchedVal = z.get("foo")To add this functionality to a Flink bas
 ed interpreter we declare the follwoing%flinkMahoutimport org.apache.zeppelin.interpreter.InterpreterContextval z = InterpreterContext.get().getResourcePool()Now we can access the resource pool in a consistent manner from the %flinkMahout interpreter.Passing a variable from Mahout to R and PlottingIn this simple example, we use Mahout (on Flink or Spark, the code is the same) to create a random matrix and then take the Sin of each element. We then randomly sample the matrix and create a tab separated string. Finally we pass that string to R where it is read as a .tsv file, and a DataFrame is created and plotted using native R plotting libraries.val mxRnd = Matrices.symmetricUniformView(5000, 2, 1234)val drmRand = drmParallelize(mxRnd)val drmSin = drmRand.mapBlock() {case (keys, block) =>    val blockB = block.like()  for (i <- 0 until block.nrow) {    blockB(i, 0) = block(i, 0)    blockB(i, 1) = Math.sin((block(i, 0) * 8))  }  keys -> blockB}z.put("sinD
 rm", org.apache.mahout.math.drm.drmSampleToTSV(drmSin, 0.85))And then in an R paragraph...%spark.r {"imageWidth": "400px"}library("ggplot2")sinStr = z.get("flinkSinDrm")data <- read.table(text= sinStr, sep="t", header=FALSE)plot(data,  col="red")",
+      "url": " /interpreter/mahout.html",
+      "group": "interpreter",
+      "excerpt": "Apache Mahout provides a unified API (the R-Like Scala DSL) for quickly creating machine learning algorithms on a variety of engines."
     }
     ,
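A compact sketch of the resource-pool handoff described in the entry above, assuming the %sparkMahout interpreter created by add_mahout.py and its implicit SparkDistributedContext are already set up as shown there; the key name randDrm and the matrix sizes are illustrative, and the same key must be used on both the put and get sides:

%sparkMahout
// sample a distributed matrix to a TSV string and share it through the resource pool
val drmRand = drmParallelize(Matrices.symmetricUniformView(500, 2, 1234))
z.put("randDrm", org.apache.mahout.math.drm.drmSampleToTSV(drmRand, 0.9))

%spark.r
tsv <- z.get("randDrm")
data <- read.table(text = tsv, sep = "\t", header = FALSE)
plot(data, col = "red")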
     
   
 
-    "/development/helium/writing_visualization_basic.html": {
-      "title": "Writing a new Helium Visualization: basic",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Writing a new VisualizationWhat is Apache Zeppelin VisualizationApache Zeppelin Visualization is a pluggable package that can be loaded/unloaded on runtime through Helium framework in Zeppelin. A Visualization is a javascript npm package and user can use them just like any other built-in visualization in notebook.How it works1. Load Helium package files from registryZeppelin needs to know what Visualization packages are available. 
 Zeppelin will read information of packages from both online and local registry.Registries are configurable through ZEPPELIN_HELIUM_LOCALREGISTRY_DEFAULT env variable or zeppelin.helium.localregistry.default property.2. Enable packagesOnce Zeppelin loads Helium package files from registries, available packages are displayed in Helium menu.Click 'enable' button.3. Create and load visualization bundle on the flyOnce a Visualization package is enabled, HeliumBundleFactory creates a js bundle. The js bundle is served by helium/bundle/load rest api endpoint.4. Run visualizationZeppelin shows additional button for loaded Visualizations.User can use just like any other built-in visualizations.Write new Visualization1. Create a npm packageCreate a package.json in your new Visualization directory. You can add any dependencies in package.json, but you must include two dependencies: zeppelin-vis and zeppelin-tabledata.Here's an example{  "name": &qu
 ot;zeppelin_horizontalbar",  "description" : "Horizontal Bar chart",  "version": "1.0.0",  "main": "horizontalbar",  "author": "",  "license": "Apache-2.0",  "dependencies": {    "zeppelin-tabledata": "*",    "zeppelin-vis": "*"  }}2. Create your own visualizationTo create your own visualization, you need to create a js file and import Visualization class from zeppelin-vis package and extend the class. zeppelin-tabledata package provides some useful transformations, like pivot, you can use in your visualization. (you can create your own transformation, too).Visualization class, there're several methods that you need to override and implement. Here's simple visualization that just prints Hello 
 world.import Visualization from 'zeppelin-vis'import PassthroughTransformation from 'zeppelin-tabledata/passthrough'export default class helloworld extends Visualization {  constructor(targetEl, config) {    super(targetEl, config)    this.passthrough = new PassthroughTransformation(config);  }  render(tableData) {    this.targetEl.html('Hello world!')  }  getTransformation() {    return this.passthrough  }}To learn more about Visualization class, check visualization.js.You can check complete visualization package example here.Zeppelin's built-in visualization uses the same API, so you can check built-in visualizations as additional examples.3. Create Helium package file and locally deployHelium Package file is a json file that provides information about the application.Json file contains the following information{  "type" : "VISUALIZATION",  "name" : "zeppelin_hori
 zontalbar",  "description" : "Horizontal Bar chart (example)",  "license" : "Apache-2.0",  "artifact" : "./zeppelin-examples/zeppelin-example-horizontalbar",  "icon" : "<i class='fa fa-bar-chart rotate90flipX'></i>"}Place this file in your local registry directory (default ./helium).typeWhen you're creating a visualization, 'type' should be 'VISUALIZATION'. Check these types as well.Helium ApplicationHelium SpellnameName of visualization. Should be unique. Allows [A-Za-z90-9_].descriptionA short description about visualization.artifactLocation of the visualization npm package. Support npm package with version or local filesystem path.e.g.When artifact exists in npm repository"artifact": "my-visualiztion@1.0.0"When 
 artifact exists in local file system"artifact": "/path/to/my/visualization"licenseLicense information.e.g."license": "Apache-2.0"iconIcon to be used in visualization select button. String in this field will be rendered as a HTML tag.e.g."icon": "<i class='fa fa-coffee'></i>"4. Run in dev modePlace your Helium package file in local registry (ZEPPELIN_HOME/helium).Run Zeppelin. And then run zeppelin-web in visualization dev mode.cd zeppelin-webyarn run dev:heliumYou can browse localhost:9000. Everytime refresh your browser, Zeppelin will rebuild your visualization and reload changes.5. Publish your visualizationOnce it's done, publish your visualization package using npm publish.That's it. With in an hour, your visualization will be available in Zeppelin's helium menu.See MoreCheck Helium Visualization: Transfor
 mation for more complex examples.",
-      "url": " /development/helium/writing_visualization_basic.html",
-      "group": "development/helium",
-      "excerpt": "Apache Zeppelin Visualization is a pluggable package that can be loaded/unloaded on runtime through Helium framework in Zeppelin. A Visualization is a javascript npm package and user can use them just like any other built-in visualization in a note."
+    "/interpreter/spark.html": {
+      "title": "Apache Spark Interpreter for Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Spark Interpreter for Apache ZeppelinOverviewApache Spark is a fast and general-purpose cluster computing system.It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.Apache Spark is supported in Zeppelin with Spark interpreter group which consists of below five interpreters.      Name    Class    Description        %spark    SparkInterpreter    Creates a SparkConte
 xt and provides a Scala environment        %spark.pyspark    PySparkInterpreter    Provides a Python environment        %spark.r    SparkRInterpreter    Provides an R environment with SparkR support        %spark.sql    SparkSQLInterpreter    Provides a SQL environment        %spark.dep    DepInterpreter    Dependency loader  ConfigurationThe Spark interpreter can be configured with properties provided by Zeppelin.You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties.      Property    Default    Description        args        Spark commandline args      master    local[*]    Spark master uri.  ex) spark://masterhost:7077      spark.app.name    Zeppelin    The name of spark application.        spark.cores.max        Total number of cores to use.  Empty value uses all available core.        spark.executor.memory     1g    Executor memory per worker instance.  ex) 512m, 32g        zeppelin.dep
 .additionalRemoteRepository    spark-packages,  http://dl.bintray.com/spark-packages/maven,  false;    A list of id,remote-repository-URL,is-snapshot;  for each remote repository.        zeppelin.dep.localrepo    local-repo    Local repository for dependency loader        PYSPARKPYTHON    python    Python binary executable to use for PySpark in both driver and workers (default is python).            Property spark.pyspark.python take precedence if it is set        PYSPARKDRIVERPYTHON    python    Python binary executable to use for PySpark in driver only (default is PYSPARKPYTHON).            Property spark.pyspark.driver.python take precedence if it is set        zeppelin.spark.concurrentSQL    false    Execute multiple SQL concurrently if set true.        zeppelin.spark.maxResult    1000    Max number of Spark SQL result to display.        zeppelin.spark.printREPLOutput    true    Print REPL output        zeppelin.spark.useHiveContext    true    Use HiveContext instead of SQLConte
 xt if it is true.        zeppelin.spark.importImplicit    true    Import implicits, UDF collection, and sql if set true.        zeppelin.spark.enableSupportedVersionCheck    true    Do not change - developer only setting, not for production use      zeppelin.spark.uiWebUrl        Overrides Spark UI default URL. Value should be a full URL (ex: http://{hostName}/{uniquePath}  Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need to follow below two simple steps.1. Export SPARK_HOMEIn conf/zeppelin-env.sh, export SPARK_HOME environment variable with your Spark installation path.For example,export SPARK_HOME=/usr/lib/sparkYou can optionally set more environment variables# set hadoop conf direxport HADOOP_CONF_DIR=/usr/lib/hadoop# set options to pass spark-submit commandexport SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0"# extra classpath. e.g. set classp
 ath for hive-site.xmlexport ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/confFor Windows, ensure you have winutils.exe in %HADOOP_HOME%bin. Please see Problems running Hadoop on Windows for the details.2. Set master in Interpreter menuAfter start Zeppelin, go to Interpreter menu and edit master property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.For example,local[*] in local modespark://master:7077 in standalone clusteryarn-client in Yarn client modeyarn-cluster in Yarn cluster modemesos://host:5050 in Mesos clusterThat's it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way.For the further information about Spark & Zeppelin version compatibility, please refer to "Available Interpreters" section in Zeppelin download page.Note that without exporting SPARK_HOME, it's running in local mode with included version of Spark. The incl
 uded version may vary depending on the build profile.3. Yarn modeZeppelin supports both yarn client and yarn cluster mode (yarn cluster mode is supported from 0.8.0). For yarn mode, you must specify SPARK_HOME & HADOOP_CONF_DIR.You can either specify them in zeppelin-env.sh, or in the interpreter setting page. Specifying them in zeppelin-env.sh means you can use only one version of spark & hadoop. Specifying them in the interpreter setting page means you can use multiple versions of spark & hadoop in one zeppelin instance.SparkContext, SQLContext, SparkSession, ZeppelinContextSparkContext, SQLContext and ZeppelinContext are automatically created and exposed as variable names sc, sqlContext and z, respectively, in Scala, Python and R environments.Starting from 0.6.1 SparkSession is available as variable spark when you are using Spark 2.x.Note that the Scala/Python/R environments share the same SparkContext, SQLContext and ZeppelinContext instance. How to pass property to Spa
 rkConfThere're 2 kinds of properties that would be passed to SparkConfStandard spark property (prefix with spark.). e.g. spark.executor.memory will be passed to SparkConfNon-standard spark property (prefix with zeppelin.spark.).  e.g. zeppelin.spark.property_1, property_1 will be passed to SparkConfDependency ManagementThere are two ways to load external libraries in Spark interpreter. First is using interpreter setting menu and second is loading Spark properties.1. Setting Dependencies via Interpreter SettingPlease see Dependency Management for the details.2. Loading Spark PropertiesOnce SPARK_HOME is set in conf/zeppelin-env.sh, Zeppelin uses spark-submit as spark interpreter runner. spark-submit supports two ways to load configurations.The first is command line options such as --master and Zeppelin can pass these options to spark-submit by exporting SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh. Second is reading configuration options from SPARK_HOME/conf/spark-defaults.co
nf. Spark properties that a user can set to distribute libraries are:      spark-defaults.conf    SPARK_SUBMIT_OPTIONS    Description        spark.jars    --jars    Comma-separated list of local jars to include on the driver and executor classpaths.        spark.jars.packages    --packages    Comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version.        spark.files    --files    Comma-separated list of files to be placed in the working directory of each executor.  Here are a few examples:SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.shexport SPARK_SUBMIT_OPTIONS=&quot;--packages com.databricks:spark-csv_2.10:1.2.0 --jars /path/mylib1.jar,/path/mylib2.jar --files /path/mylib1.py,/path/mylib2.zip,/path/mylib3.egg&quot;SPARK_HOME/conf/spark-defaults.confspark
.jars        /path/mylib1.jar,/path/mylib2.jarspark.jars.packages   com.databricks:spark-csv_2.10:1.2.0spark.files       /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip3. Dynamic Dependency Loading via %spark.dep interpreterNote: the %spark.dep interpreter loads libraries into %spark and %spark.pyspark but not into the %spark.sql interpreter, so we recommend you use the first option instead.When your code requires an external library, instead of downloading it, copying it, and restarting Zeppelin, you can easily do the following using the %spark.dep interpreter.Load libraries recursively from a maven repositoryLoad libraries from the local filesystemAdd an additional maven repositoryAutomatically add libraries to the Spark cluster (you can turn this off)The dep interpreter leverages the Scala environment, so you can write any Scala code here.Note that the %spark.dep interpreter should be used before %spark, %spark.pyspark, %spark.sql.Here is the usage.%spark.depz.reset() // clean up previously added artifacts and repositories// add maven 
 repositoryz.addRepo("RepoName").url("RepoURL")// add maven snapshot repositoryz.addRepo("RepoName").url("RepoURL").snapshot()// add credentials for private maven repositoryz.addRepo("RepoName").url("RepoURL").username("username").password("password")// add artifact from filesystemz.load("/path/to.jar")// add artifact from maven repository, with no dependencyz.load("groupId:artifactId:version").excludeAll()// add artifact recursivelyz.load("groupId:artifactId:version")// add artifact recursively except comma separated GroupID:ArtifactId listz.load("groupId:artifactId:version").exclude("groupId:artifactId,groupId:artifactId, ...")// exclude with patternz.load("groupId:artifactId:version").exclude(*)z.load("groupId:artifactId:ver
 sion").exclude("groupId:artifactId:*")z.load("groupId:artifactId:version").exclude("groupId:*")// local() skips adding artifact to spark clusters (skipping sc.addJar())z.load("groupId:artifactId:version").local()ZeppelinContextZeppelin automatically injects ZeppelinContext as variable z in your Scala/Python environment. ZeppelinContext provides some additional functions and utilities.Exploring Spark DataFramesZeppelinContext provides a show method, which, using Zeppelin's table feature, can be used to nicely display a Spark DataFrame:df = spark.read.csv('/path/to/csv')z.show(df)Object ExchangeZeppelinContext extends map and it's shared between Scala and Python environment.So you can put some objects from Scala and read it from Python, vice versa.  // Put object from scala%sparkval myObject = ...z.put("objName", myObject)// Exchanging data framesmyScalaDa
 taFrame = ...z.put("myScalaDataFrame", myScalaDataFrame)val myPythonDataFrame = z.get("myPythonDataFrame").asInstanceOf[DataFrame]    # Get object from python%spark.pysparkmyObject = z.get("objName")# Exchanging data framesmyPythonDataFrame = ...z.put("myPythonDataFrame", postsDf._jdf)myScalaDataFrame = DataFrame(z.get("myScalaDataFrame"), sqlContext)  Form CreationZeppelinContext provides functions for creating forms.In Scala and Python environments, you can create forms programmatically.  %spark/* Create text input form */z.input("formName")/* Create text input form with default value */z.input("formName", "defaultValue")/* Create select form */z.select("formName", Seq(("option1", "option1DisplayName"),                         ("option2", "option2DisplayName&a
 mp;quot;)))/* Create select form with default value*/z.select("formName", "option1", Seq(("option1", "option1DisplayName"),                                    ("option2", "option2DisplayName")))    %spark.pyspark# Create text input formz.input("formName")# Create text input form with default valuez.input("formName", "defaultValue")# Create select formz.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")])# Create select form with default valuez.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")], "option1")  In sql
 environment, you can create forms using a simple template.%spark.sqlselect * from ${table=defaultTableName} where text like &#39;%${search}%&#39;To learn more about dynamic forms, check out Dynamic Form.Matplotlib Integration (pyspark)Both the python and pyspark interpreters have built-in support for inline visualization using matplotlib, a popular plotting library for python. More details can be found in the python interpreter documentation, since matplotlib support is identical. More advanced interactive plotting can be done with pyspark through utilizing Zeppelin&#39;s built-in Angular Display System, as shown below:Interpreter setting optionYou can choose one of the shared, scoped and isolated options when you configure the Spark interpreter.The Spark interpreter creates a separate Scala compiler per notebook but shares a single SparkContext in scoped mode (experimental). It creates a separate SparkContext per notebook in isolated mode.IPython supportBy default, Zeppelin will use I
Python in pyspark when IPython is available; otherwise it falls back to the original PySpark implementation.If you don&#39;t want to use IPython, then you can set zeppelin.pyspark.useIPython as false in the interpreter setting. For the IPython features, you can refer to the Python Interpreter docSetting up Zeppelin with KerberosLogical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:Configuration SetupOn the server where Zeppelin is installed, install the Kerberos client modules and configuration, krb5.conf.This is to make the server communicate with the KDC.Set SPARK_HOME in [ZEPPELIN_HOME]/conf/zeppelin-env.sh to use spark-submit (additionally, you might have to set export HADOOP_CONF_DIR=/etc/hadoop/conf)Add the two properties below to the Spark configuration ([SPARK_HOME]/conf/spark-defaults.conf):spark.yarn.principalspark.yarn.keytabNOTE: If you do not have permission to access the above spark-defaults.conf file, you can optionally add the above lines to the 
 Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.That's it. Play with Zeppelin!",
+      "url": " /interpreter/spark.html",
+      "group": "interpreter",
+      "excerpt": "Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution engine."
     }
     ,
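As a worked illustration of the Kerberos step in the spark entry above, here is a minimal sketch of the two lines added to [SPARK_HOME]/conf/spark-defaults.conf; the principal and keytab path are placeholders for illustration only, not values taken from this commit:

    # illustrative placeholders only
    spark.yarn.principal    zeppelin@EXAMPLE.COM
    spark.yarn.keytab       /etc/security/keytabs/zeppelin.service.keytab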
     
   
 
-    "/development/helium/writing_visualization_transformation.html": {
-      "title": "Transformations in Zeppelin Visualization",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Transformations for Zeppelin VisualizationOverviewTransformations renders setting which allows users to set columns and transforms table rows according to the configured columns.Zeppelin provides 4 types of transformations.1. PassthroughTransformationPassthroughTransformation is the simple transformation which does not convert original tabledata at all.See passthrough.js2. ColumnselectorTransformationColumnselectorTransformation is
  uses when you need N axes but do not need aggregation. See columnselector.js3. PivotTransformationPivotTransformation provides group by and aggregation. Every chart using PivotTransformation has 3 axes. Keys, Groups and Values.See pivot.js4. AdvancedTransformationAdvancedTransformation has more detailed options while providing existing features of PivotTransformation and ColumnselectorTransformationmultiple sub chartsconfigurable chart axesparameter widgets: input, checkbox, option, textareaparsing parameters automatically based on their typesexpand / fold axis and parameter panelsmultiple transformation methods while supporting lazy converting re-initialize the whole configuration based on spec hash.SpecAdvancedTransformation requires spec which includes axis and parameter details for charts.Let's create 2 sub-charts called line and no-group. Each sub chart can have different axis and parameter depending on their requirements.class AwesomeVisualization extends Visualizatio
 n {  constructor(targetEl, config) {    super(targetEl, config)    const spec = {      charts: {        'line': {          transform: { method: 'object', },          sharedAxis: false, /** set if you want to share axes between sub charts, default is `false` */          axis: {            'xAxis': { dimension: 'multiple', axisType: 'key', description: 'serial', },            'yAxis': { dimension: 'multiple', axisType: 'aggregator', description: 'serial', },            'category': { dimension: 'multiple', axisType: 'group', description: 'categorical', },          },          parameter: {            'xAxisUnit': { valueType: 'string', defaultValue: '', description: 'unit of xAxis', },            &#3
 9;yAxisUnit': { valueType: 'string', defaultValue: '', description: 'unit of yAxis', },            'lineWidth': { valueType: 'int', defaultValue: 0, description: 'width of line', },          },        },        'no-group': {          transform: { method: 'object', },          sharedAxis: false,          axis: {            'xAxis': { dimension: 'single', axisType: 'key', },            'yAxis': { dimension: 'multiple', axisType: 'value', },          },          parameter: {            'xAxisUnit': { valueType: 'string', defaultValue: '', description: 'unit of xAxis', },            'yAxisUnit': { valueType: 'string', defaultValue: '', description: 
 'unit of yAxis', },        },      },    }    this.transformation = new AdvancedTransformation(config, spec)  }  ...  // `render` will be called whenever `axis` or `parameter` is changed   render(data) {    const { chart, parameter, column, transformer, } = data    if (chart === 'line') {      const transformed = transformer()      // draw line chart     } else if (chart === 'no-group') {      const transformed = transformer()      // draw no-group chart     }  }}Spec: axisField NameAvailable Values (type)DescriptiondimensionsingleAxis can contains only 1 columndimensionmultipleAxis can contains multiple columnsaxisTypekeyColumn(s) in this axis will be used as key like in PivotTransformation. These columns will be served in column.keyaxisTypeaggregatorColumn(s) in this axis will be used as value like in PivotTransformation. These columns will be served in column.aggregatoraxisTypegroupColumn(s) in this axis will be used as group like i
 n PivotTransformation. These columns will be served in column.groupaxisType(string)Any string value can be used here. These columns will be served in column.custommaxAxisCount (optional)(int)The max number of columns that this axis can contain. (unlimited if undefined)minAxisCount (optional)(int)The min number of columns that this axis should contain to draw chart. (1 in case of single dimension)description (optional)(string)Description for the axis.Here is an example.axis: {  'xAxis': { dimension: 'multiple', axisType: 'key',  },  'yAxis': { dimension: 'multiple', axisType: 'aggregator'},  'category': { dimension: 'multiple', axisType: 'group', maxAxisCount: 2, valueType: 'string', },},Spec: sharedAxisIf you set sharedAxis: false for sub charts, then their axes are persisted in global space (shared). It's useful for 
 when you creating multiple sub charts sharing their axes but have different parameters. For example, basic-column, stacked-column, percent-columnpie and donutHere is an example.    const spec = {      charts: {        'column': {          transform: { method: 'array', },          sharedAxis: true,          axis: { ... },          parameter: { ... },        },        'stacked': {          transform: { method: 'array', },          sharedAxis: true,          axis: { ... }          parameter: { ... },        },Spec: parameterField NameAvailable Values (type)DescriptionvalueTypestringParameter which has string valuevalueTypeintParameter which has int valuevalueTypefloatParameter which has float valuevalueTypebooleanParameter which has boolean value used with checkbox widget usuallyvalueTypeJSONParameter which has JSON value used with textarea widget usually. defaultValue should be "" (empty string). Thisdes
 cription(string)Description of this parameter. This value will be parsed as HTML for pretty outputwidgetinputUse input widget. This is the default widget (if widget is undefined)widgetcheckboxUse checkbox widget.widgettextareaUse textarea widget.widgetoptionUse select + option widget. This parameter should have optionValues field as well.optionValues(Array)Available option values used with the option widgetHere is an example.parameter: {  // string type, input widget  'xAxisUnit': { valueType: 'string', defaultValue: '', description: 'unit of xAxis', },  // boolean type, checkbox widget  'inverted': { widget: 'checkbox', valueType: 'boolean', defaultValue: false, description: 'invert x and y axes', },  // string type, option widget with `optionValues`  'graphType': { widget: 'option', valueType: 'string', defa
 ultValue: 'line', description: 'graph type', optionValues: [ 'line', 'smoothedLine', 'step', ], },  // HTML in `description`  'dateFormat': { valueType: 'string', defaultValue: '', description: 'format of date (<a href="https://docs.amcharts.com/3/javascriptcharts/AmGraph#dateFormat">doc</a>) (e.g YYYY-MM-DD)', },  // JSON type, textarea widget  'yAxisGuides': { widget: 'textarea', valueType: 'JSON', defaultValue: '', description: 'guides of yAxis ', },Spec: transformField NameAvailable Values (type)Descriptionmethodobjectdesigned for rows requiring object manipulationmethodarraydesigned for rows requiring array manipulationmethodarray:2-keydesigned for xyz charts (e.g bubble chart)methoddrill-downdesigned for drill-d
 own chartsmethodrawwill return the original tableData.rowsWhatever you specified as transform.method, the transformer value will be always function for lazy computation. // advanced-transformation.util#getTransformerif (transformSpec.method === 'raw') {  transformer = () => { return rows; }} else if (transformSpec.method === 'array') {  transformer = () => {    ...    return { ... }  }}Here is actual usage.class AwesomeVisualization extends Visualization {  constructor(...) { /** setup your spec */ }  ...   // `render` will be called whenever `axis` or `parameter` are changed  render(data) {    const { chart, parameter, column, transformer, } = data    if (chart === 'line') {      const transformed = transformer()      // draw line chart     } else if (chart === 'no-group') {      const transformed = transformer()      // draw no-group chart     }  }  ...}",
-      "url": " /development/helium/writing_visualization_transformation.html",
-      "group": "development/helium",
-      "excerpt": "Description for Transformations"
+    "/interpreter/python.html": {
+      "title": "Python 2 & 3 Interpreter for Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Python 2 & 3 Interpreter for Apache ZeppelinConfiguration      Property    Default    Description        zeppelin.python    python    Path of the already installed Python binary (could be python2 or python3).    If python is not in your $PATH you can set the absolute directory (example : /usr/bin/python)            zeppelin.python.maxResult    1000    Max number of dataframe rows to display.  Enabling Python InterpreterIn a
 notebook, to enable the Python interpreter, click on the Gear icon and select PythonUsing the Python InterpreterIn a paragraph, use %python to select the Python interpreter and then input all commands.The interpreter can only work if you already have python installed (the interpreter doesn&#39;t bring its own python binaries).To access the help, type help()Python environmentsDefaultBy default, PythonInterpreter will use the python command defined in the zeppelin.python property to run the python process.The interpreter can use all modules already installed (with pip, easy_install...)CondaConda is a package management system and environment management system for python.The %python.conda interpreter lets you change between environments.Usageget the Conda information: %python.conda infolist the Conda environments: %python.conda env listcreate a conda environment: %python.conda create --name [ENV NAME]activate an environment (the python interpreter will be restarted): %python.conda activate [ENV NAME]d
eactivate%python.conda deactivateget the installed package list inside the current environment%python.conda listinstall a package%python.conda install [PACKAGE NAME]uninstall a package%python.conda uninstall [PACKAGE NAME]DockerThe %python.docker interpreter allows PythonInterpreter to create the python process in a specified docker container.Usageactivate an environment%python.docker activate [Repository]%python.docker activate [Repository:Tag]%python.docker activate [Image Id]deactivate%python.docker deactivateHere is an example# activate the latest tensorflow image as a python environment%python.docker activate gcr.io/tensorflow/tensorflow:latestUsing Zeppelin Dynamic FormsYou can leverage Zeppelin Dynamic Form inside your Python code.Zeppelin Dynamic Form can only be used if the py4j Python library is installed in your system. If not, you can install it with pip install py4j.Example : %python### Input formprint (z.input(&quot;f1&quot;,&quot;defaultValue&quot;))### Select formprint (z.sele
 ct("f1",[("o1","1"),("o2","2")],"2"))### Checkbox formprint("".join(z.checkbox("f3", [("o1","1"), ("o2","2")],["1"])))Matplotlib integrationThe python interpreter can display matplotlib figures inline automatically using the pyplot module:%pythonimport matplotlib.pyplot as pltplt.plot([1, 2, 3])This is the recommended method for using matplotlib from within a Zeppelin notebook. The output of this command will by default be converted to HTML by implicitly making use of the %html magic. Additional configuration can be achieved using the builtin z.configure_mpl() method. For example, z.configure_mpl(width=400, height=300, fmt='svg')plt.plot([1, 2, 3])Will produce a 400x300 image in SVG format, which by default are normally 600x400 and PNG r
 espectively. In the future, another option called angular can be used to make it possible to update a plot produced from one paragraph directly from another (the output will be %angular instead of %html). However, this feature is already available in the pyspark interpreter. More details can be found in the included "Zeppelin Tutorial: Python - matplotlib basic" tutorial notebook. If Zeppelin cannot find the matplotlib backend files (which should usually be found in $ZEPPELIN_HOME/interpreter/lib/python) in your PYTHONPATH, then the backend will automatically be set to agg, and the (otherwise deprecated) instructions below can be used for more limited inline plotting.If you are unable to load the inline backend, use z.show(plt): python%pythonimport matplotlib.pyplot as pltplt.figure()(.. ..)z.show(plt)plt.close()The z.show() function can take optional parameters to adapt graph dimensions (width and height) as well as output format (png or optionally svg).%pythonz.s
how(plt, width=&#39;50px&#39;)z.show(plt, height=&#39;150px&#39;, fmt=&#39;svg&#39;)Pandas integrationThe Apache Zeppelin Table Display System provides built-in data visualization capabilities. The Python interpreter leverages it to visualize Pandas DataFrames through a similar z.show() API, the same as with the Matplotlib integration.Example:import pandas as pdrates = pd.read_csv(&quot;bank.csv&quot;, sep=&quot;;&quot;)z.show(rates)SQL over Pandas DataFramesThere is a convenience %python.sql interpreter that matches the Apache Spark experience in Zeppelin and enables the use of SQL to query Pandas DataFrames and visualize the results through the built-in Table Display System.PrerequisitesPandas pip install pandasPandaSQL pip install -U pandasqlIf the default bound interpreter is Python (first in the interpreter list, under the Gear icon), you can just use it as %sql, i.e.first paragraphimport pandas as pdrates = pd.read_csv(&quot;bank.csv&quot;, sep=&am
p;quot;;&quot;)next paragraph%sqlSELECT * FROM rates WHERE age &lt; 40Otherwise it can be referred to as %python.sqlIPython SupportIPython is more powerful than the default python interpreter and provides extra functionality. You can use IPython with Python2 or Python3, depending on which python you set in zeppelin.python.Prerequisites- Jupyter `pip install jupyter`- grpcio `pip install grpcio`If you have already installed anaconda, then you just need to install grpcio, as Jupyter is already included in anaconda.In addition to all the basic functions of the python interpreter, you can use all the advanced IPython features as you would in Jupyter Notebook.e.g. Use IPython magic%python.ipython#python helprange?#timeit%timeit range(100)Use matplotlib %python.ipython%matplotlib inlineimport matplotlib.pyplot as pltprint(&quot;hello world&quot;)data=[1,2,3,4]plt.figure()plt.plot(data)We also make ZeppelinContext available in the IPython Interpreter. You can use ZeppelinContext to create dynamic 
forms and display pandas DataFrames.e.g.Create a dynamic formz.input(name=&#39;my_name&#39;, defaultValue=&#39;hello&#39;)Show a pandas dataframeimport pandas as pddf = pd.DataFrame({&#39;id&#39;:[1,2,3], &#39;name&#39;:[&#39;a&#39;,&#39;b&#39;,&#39;c&#39;]})z.show(df)By default, Zeppelin will use IPython in %python.python if IPython is available; otherwise it falls back to the original Python implementation.If you don&#39;t want to use IPython, then you can set zeppelin.python.useIPython as false in the interpreter setting.Technical descriptionFor in-depth technical details on the current implementation, please refer to python/README.md.Some features not yet implemented in the Python InterpreterInterrupting a paragraph execution (the cancel() method) is currently only supported on Linux and MacOS. If the interpreter runs on another operating system (for instance MS Windows), interrupting a paragraph will close the whole interpreter. A JIRA ticket (Z
EPPELIN-893) is open to implement this feature in a future release of the interpreter.The progress bar in the web UI (the getProgress() method) is currently not implemented.Code completion is currently not implemented.",
+      "url": " /interpreter/python.html",
+      "group": "interpreter",
+      "excerpt": "Python is a programming language that lets you work quickly and integrate systems more effectively."
     }
     ,
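As a worked illustration of the Dynamic Forms and Pandas integration described in the python entry above, here is a minimal sketch that assumes the same bank.csv sample file; maxAge is a hypothetical form name used only for this example:

    %python
    # read the sample data used in the Pandas integration example above
    import pandas as pd
    rates = pd.read_csv("bank.csv", sep=";")
    # create a text input form with a default value and convert the returned string
    max_age = int(z.input("maxAge", "40"))
    # display the filtered DataFrame through Zeppelin's table display system
    z.show(rates[rates["age"] < max_age])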
     
   
 
-    "/development/writing_zeppelin_interpreter.html": {
-      "title": "Writing a New Interpreter",

[... 1111 lines stripped ...]