Posted to commits@zeppelin.apache.org by zj...@apache.org on 2019/10/30 13:45:27 UTC
svn commit: r1869172 [21/37] - in /zeppelin/site/docs/0.9.0-SNAPSHOT: ./
assets/themes/zeppelin/img/screenshots/ development/
development/contribution/ development/helium/ interpreter/ quickstart/
setup/basics/ setup/deployment/ setup/operation/ setup/...
Modified: zeppelin/site/docs/0.9.0-SNAPSHOT/search_data.json
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.9.0-SNAPSHOT/search_data.json?rev=1869172&r1=1869171&r2=1869172&view=diff
==============================================================================
--- zeppelin/site/docs/0.9.0-SNAPSHOT/search_data.json (original)
+++ zeppelin/site/docs/0.9.0-SNAPSHOT/search_data.json Wed Oct 30 13:45:26 2019
@@ -3,7 +3,7 @@
"/interpreter/livy.html": {
"title": "Livy Interpreter for Apache Zeppelin",
- "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Livy Interpreter for Apache ZeppelinOverviewLivy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN.Interactive Scala, Python and R shellsBatch submissions in Scala, Java, PythonMulti users can share the same server (impersonation support)Can be used for submitting jobs from anywhere with RESTDoes not require a
ny code change to your programsRequirementsAdditional requirements for the Livy interpreter are:Spark 1.3 or above.Livy server.ConfigurationWe added some common configurations for spark, and you can set any configuration you want.You can find all Spark configurations in here.And instead of starting property with spark. it should be replaced with livy.spark..Example: spark.driver.memory to livy.spark.driver.memory Property Default Description zeppelin.livy.url http://localhost:8998 URL where livy server is running zeppelin.livy.spark.sql.maxResult 1000 Max number of Spark SQL result to display. zeppelin.livy.spark.sql.field.truncate true Whether to truncate field values longer than 20 characters or not zeppelin.livy.session.create_timeout 120 Timeout in seconds for session creation zeppelin.livy.displayAppInfo true Whether to display app info zeppelin.livy.pull_status.interval.millis 1000 The int
erval for checking paragraph execution status livy.spark.driver.cores Driver cores. ex) 1, 2. livy.spark.driver.memory Driver memory. ex) 512m, 32g. livy.spark.executor.instances Executor instances. ex) 1, 4. livy.spark.executor.cores Num cores per executor. ex) 1, 4. livy.spark.executor.memory Executor memory per worker instance. ex) 512m, 32g. livy.spark.dynamicAllocation.enabled Use dynamic resource allocation. ex) True, False. livy.spark.dynamicAllocation.cachedExecutorIdleTimeout Remove an executor which has cached data blocks. livy.spark.dynamicAllocation.minExecutors Lower bound for the number of executors. livy.spark.dynamicAllocation.initialExecutors Initial number of executors to run. livy.spark.dynamicAllocation.maxExecutors Upper bound for the number of executors. livy.spark.jars.packages Adding extra libr
aries to livy interpreter zeppelin.livy.ssl.trustStore client trustStore file. Used when livy ssl is enabled zeppelin.livy.ssl.trustStorePassword password for trustStore file. Used when livy ssl is enabled zeppelin.livy.http.headers key_1: value_1; key_2: value_2 custom http headers when calling livy rest api. Each http header is separated by `;`, and each header is one key value pair where key value is separated by `:` We remove livy.spark.master in zeppelin-0.7. Because we sugguest user to use livy 0.3 in zeppelin-0.7. And livy 0.3 don&#39;t allow to specify livy.spark.master, it enfornce yarn-cluster mode.Adding External librariesYou can load dynamic library to livy interpreter by set livy.spark.jars.packages property to comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. The format for the coordinates should be groupId:artifactId:version.Example Property Example Description
livy.spark.jars.packages io.spray:spray-json_2.10:1.3.1 Adding extra libraries to livy interpreter How to useBasically, you can usespark%livy.sparksc.versionpyspark%livy.pysparkprint &quot;1&quot;sparkR%livy.sparkrhello &lt;- function( name ) { sprintf( &quot;Hello, %s&quot;, name );}hello(&quot;livy&quot;)ImpersonationWhen Zeppelin server is running with authentication enabled,then this interpreter utilizes Livyâs user impersonation featurei.e. sends extra parameter for creating and running a session (&quot;proxyUser&quot;: &quot;${loggedInUser}&quot;).This is particularly useful when multi users are sharing a Notebook server.Apply Zeppelin Dynamic FormsYou can leverage Zeppelin Dynamic Form. Form templates is only avalible for livy sql interpreter.%livy.sqlselect * from products where ${product_id=1}And creating dynamic formst programmatically is not feasible in livy interpreter, because ZeppelinContext i
s not available in livy interpreter.Shared SparkContextStarting from livy 0.5 which is supported by Zeppelin 0.8.0, SparkContext is shared between scala, python, r and sql.That means you can query the table via %livy.sql when this table is registered in %livy.spark, %livy.pyspark, $livy.sparkr.FAQLivy debugging: If you see any of these in error consoleConnect to livyhost:8998 [livyhost/127.0.0.1, livyhost/0:0:0:0:0:0:0:1] failed: Connection refusedLooks like the livy server is not up yet or the config is wrongException: Session not found, Livy server would have restarted, or lost session.The session would have timed out, you may need to restart the interpreter.Blacklisted configuration values in session config: spark.masterEdit conf/spark-blacklist.conf file in livy server and comment out #spark.master line.If you choose to work on livy in apps/spark/java directory in https://github.com/cloudera/hue,copy spark-user-configurable-options.template to spark-user-configurable-options.con
f file in livy server and comment out #spark.master.",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Livy Interpreter for Apache ZeppelinOverviewLivy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN.Interactive Scala, Python and R shellsBatch submissions in Scala, Java, PythonMulti users can share the same server (impersonation support)Can be used for submitting jobs from anywhere with RESTDoes not require a
ny code change to your programsRequirementsAdditional requirements for the Livy interpreter are:Spark 1.3 or above.Livy server.ConfigurationWe added some common Spark configurations, and you can set any configuration you want.You can find all Spark configurations here.Instead of starting a property with spark., prefix it with livy.spark..Example: spark.driver.memory becomes livy.spark.driver.memory Property Default Description zeppelin.livy.url http://localhost:8998 URL where the livy server is running zeppelin.livy.spark.sql.maxResult 1000 Max number of Spark SQL results to display. zeppelin.livy.spark.sql.field.truncate true Whether to truncate field values longer than 20 characters or not zeppelin.livy.session.create_timeout 120 Timeout in seconds for session creation zeppelin.livy.displayAppInfo true Whether to display app info zeppelin.livy.pull_status.interval.millis 1000 The int
erval for checking paragraph execution status livy.spark.driver.cores Driver cores. ex) 1, 2. livy.spark.driver.memory Driver memory. ex) 512m, 32g. livy.spark.executor.instances Executor instances. ex) 1, 4. livy.spark.executor.cores Num cores per executor. ex) 1, 4. livy.spark.executor.memory Executor memory per worker instance. ex) 512m, 32g. livy.spark.dynamicAllocation.enabled Use dynamic resource allocation. ex) True, False. livy.spark.dynamicAllocation.cachedExecutorIdleTimeout Remove an executor which has cached data blocks. livy.spark.dynamicAllocation.minExecutors Lower bound for the number of executors. livy.spark.dynamicAllocation.initialExecutors Initial number of executors to run. livy.spark.dynamicAllocation.maxExecutors Upper bound for the number of executors. livy.spark.jars.packages Adding extra libr
aries to livy interpreter zeppelin.livy.ssl.trustStore client trustStore file. Used when livy ssl is enabled zeppelin.livy.ssl.trustStorePassword password for trustStore file. Used when livy ssl is enabled zeppelin.livy.ssl.trustStoreType JKS type of truststore. Either JKS or PKCS12. zeppelin.livy.ssl.keyStore client keyStore file. Needed if Livy requires two way SSL authentication. zeppelin.livy.ssl.keyStorePassword password for keyStore file. zeppelin.livy.ssl.keyStoreType JKS type of keystore. Either JKS or PKCS12. zeppelin.livy.ssl.keyPassword password for key in the keyStore file. Defaults to zeppelin.livy.ssl.keyStorePassword. zeppelin.livy.http.headers key_1: value_1; key_2: value_2 custom http headers when calling livy rest api. Each http header is separated by `;`, and each header is one key value pair where key and value are separated by `:` We removed livy.spark.master in zeppelin-0.7 because we suggest users use livy 0.3 with zeppelin-0.7, and livy 0.3 doesn&#39;t allow specifying livy.spark.master; it enforces yarn-cluster mode.Adding External librariesYou can load dynamic libraries into the livy interpreter by setting the livy.spark.jars.packages property to a comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. The format for the coordinates should be groupId:artifactId:version.Example Property Example Description livy.spark.jars.packages io.spray:spray-json_2.10:1.3.1 Adding extra libraries to livy interpreter How to useBasically, you can usespark%livy.sparksc.versionpyspark%livy.pysparkprint &quot;1&quot;sparkR%livy.sparkrhello &lt;- function( name ) { sprintf( &quot;Hello, %s&quot;, name );}hello(&quot;livy&quot;)ImpersonationWhen the Zeppelin server is running with authentication enabled, this interpreter utilizes Livy&#39;s user impersonation feature, i.e. it sends an extra parameter for creating and running a session (&quot;proxyUser&quot;: &quot;${loggedInUser}&quot;).This is particularly useful when multiple users are sharing a Notebook server.Apply Zeppelin Dynamic FormsYou can leverage Zeppelin Dynamic Form. Form templates are only available for the livy sql interpreter.%livy.sqlselect * from products where ${product_id=1}Creating dynamic forms programmatically is not feasible in the livy interpreter, because ZeppelinContext is not available in the livy interpreter.Shared SparkContextStarting from livy 0.5, which is supported by Zeppelin 0.8.0, SparkContext is shared between scala, python, r and sql.That means you can query a table via %livy.sql when that table is registered in %livy.spark, %livy.pyspark, %livy.sparkr.FAQLivy debugging: If you see any of these in the error consoleConnect to livyhost:8998 [livyhost/127.0.0.1, livyhost/0:0:0:0:0:0:0:1] failed: Connection refusedLooks like the livy server is not up yet or the config is wrongException: Session not found, Livy server would have restarted, or lost session.The session would have timed out; you may need to restart the interpreter.Blacklisted configuration values in session config: spark.masterEdit the conf/spark-blacklist.conf file in the livy server and comment out the #spark.master line.If you choose to work on livy in the apps/spark/java directory in https://github.com/cloudera/hue,copy spark-user-configurable-options.template to spark-user-configurable-options.conf file in livy server and comment out #spark.master.",
"url": " /interpreter/livy.html",
"group": "interpreter",
"excerpt": "Livy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN."
@@ -25,7 +25,7 @@
"/interpreter/markdown.html": {
"title": "Markdown Interpreter for Apache Zeppelin",
- "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Markdown Interpreter for Apache ZeppelinOverviewMarkdown is a plain text formatting syntax designed so that it can be converted to HTML.Apache Zeppelin uses pegdown and markdown4j as markdown parsers.In Zeppelin notebook, you can use %md in the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.In Zeppelin, Markdown interpreter is enabled by default and uses the pegdown par
ser.ExampleThe following example demonstrates the basic usage of Markdown in a Zeppelin notebook.Mathematical expressionMarkdown interpreter leverages %html display system internally. That means you can mix mathematical expressions with markdown syntax. For more information, please see Mathematical Expression section.Configuration Name Default Value Description markdown.parser.type pegdown Markdown Parser Type. Available values: pegdown, markdown4j. Pegdown Parserpegdown parser provides github flavored markdown.pegdown parser provides YUML and Websequence plugins also. Markdown4j ParserSince pegdown parser is more accurate and provides much more markdown syntax markdown4j option might be removed later. But keep this parser for the backward compatibility.",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Markdown Interpreter for Apache ZeppelinOverviewMarkdown is a plain text formatting syntax designed so that it can be converted to HTML.Apache Zeppelin uses flexmark, pegdown and markdown4j as markdown parsers.In Zeppelin notebook, you can use %md in the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.In Zeppelin, Markdown interpreter is enabled by default and uses the p
egdown parser.ExampleThe following example demonstrates the basic usage of Markdown in a Zeppelin notebook.Mathematical expressionMarkdown interpreter leverages %html display system internally. That means you can mix mathematical expressions with markdown syntax. For more information, please see Mathematical Expression section.Configuration Name Default Value Description markdown.parser.type flexmark Markdown Parser Type. Available values: flexmark, pegdown, markdown4j. Flexmark parser (Default Markdown Parser)CommonMark/Markdown Java parser with source level AST.flexmark parser provides YUML and Websequence extensions also.Pegdown Parserpegdown parser provides github flavored markdown. Although still one of the most popular Markdown parsing libraries for the JVM, pegdown has reached its end of life.The project is essentially unmaintained with tickets piling up and crucial bugs not being fixed.pegdown&#39;s parsing performance isn&#39;t great. But we keep this parser for backward compatibility.Markdown4j ParserSince the pegdown parser is more accurate and supports much more markdown syntax, the markdown4j option might be removed later. But we keep this parser for backward compatibility.",
"url": " /interpreter/markdown.html",
"group": "interpreter",
"excerpt": "Markdown is a plain text formatting syntax designed so that it can be converted to HTML. Apache Zeppelin uses markdown4j."
@@ -34,6 +34,17 @@
+ "/interpreter/submarine.html": {
+ "title": "Apache Hadoop Submarine Interpreter for Apache Zeppelin",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Submarine Interpreter for Apache ZeppelinHadoop Submarine is the latest machine learning framework subproject in the Hadoop 3.1 release. It allows Hadoop to support Tensorflow, MXNet, Caffe, Spark, etc. A variety of deep learning frameworks provide a full-featured system framework for machine learning algorithm development, distributed model training, model management, and model publishing, combined with hadoop&#39;s intrinsic
data storage and data processing capabilities to enable data scientists to effectively mine the value of the data.A deep learning algorithm project requires data acquisition, data processing, data cleaning, interactive visual programming for parameter tuning, algorithm testing, algorithm publishing, algorithm job scheduling, offline model training, online model services and many other steps. Zeppelin is a web-based notebook that supports interactive data analysis. You can use SQL, Scala, Python, etc. to create data-driven, interactive, collaborative documents.You can use the more than 20 interpreters in zeppelin (for example: spark, hive, Cassandra, Elasticsearch, Kylin, HBase, etc.) to collect, clean, and extract features from the data in Hadoop, completing the data preprocessing before machine learning model training.By integrating submarine in zeppelin, we use zeppelin&#39;s data discovery, data analysis and data visualization and coll
aboration capabilities to visualize the results of algorithm development and parameter adjustment during machine learning model training.ArchitectureAs shown in the figure above, the system architecture explains how Submarine develops and trains machine learning algorithms through Zeppelin.After installing and deploying Hadoop 3.1+ and Zeppelin, submarine will create a fully separate Zeppelin Submarine interpreter Docker container for each user in YARN. This container contains the development and runtime environment for Tensorflow. Zeppelin Server connects to the Zeppelin Submarine interpreter Docker container in YARN, allowing algorithm engineers to perform algorithm development and data visualization in Tensorflow&#39;s stand-alone environment in Zeppelin Notebook.After the algorithm is developed, the algorithm engineer can submit the algorithm directly to YARN for offline training from Zeppelin, with real-time demonstration of model training via Submarine&#39;s TensorBoard for each algorithm engineer.You can not only complete the model training of the algorithm, but also use the more than twenty interpreters in Zeppelin to complete the data preprocessing of the model; for example, you can perform data extraction, filtering, and feature extraction through the Spark interpreter in Zeppelin in the Algorithm Note.In the future, you can also use Zeppelin&#39;s upcoming Workflow orchestration service to complete Spark, Hive data processing and Tensorflow model training in one Note, organized into a workflow through visualization, with job scheduling performed in the production environment.OverviewAs shown in the figure above, from the internal implementation, Submarine combines Zeppelin&#39;s machine learning algorithm development and model training as follows.The algorithm engineer creates a Tensorflow notebook (left image) in Zeppelin by using the Submarine interpreter.It is important to not
e that you need to complete the development of the entire algorithm in a Note.You can use Spark for data preprocessing in some of the paragraphs in Note.Use Python for algorithm development and debugging of Tensorflow in other paragraphs of notebook, Submarine creates a Zeppelin Submarine Interpreter Docker Container for you in YARN, which contains the following features and services:Shell Command line tool: Allows you to view the system environment in the Zeppelin Submarine Interpreter Docker Container, Install the extension tools you need or the Python dependencies.Kerberos lib: Allows you to perform kerberos authentication and access to Hadoop clusters with Kerberos authentication enabled.Tensorflow environment: Allows you to develop tensorflow algorithm code.Python environment: Allows you to develop tensorflow code.Complete a complete algorithm development with a Note in Zeppelin. If this algorithm contains multiple modules, You can write different algorithm modules in multiple paragraphs in Note. The title of each paragraph is the name of the algorithm module. The content of the paragraph is the code content of this algorithm module.HDFS Client: Zeppelin Submarine Interpreter will automatically submit the algorithm code you wrote in Note to HDFS.Submarine interpreter Docker Image It is Submarine that provides you with an image file that supports Tensorflow (CPU and GPU versions).And installed the algorithm library commonly used by Python.You can also install other development dependencies you need on top of the base image provided by Submarine.When you complete the development of the algorithm module, You can do this by creating a new paragraph in Note and typing %submarine dashboard. Zeppelin will create a Submarine Dashboard. The machine learning algorithm written in this Note can be submitted to YARN as a JOB by selecting the JOB RUN command option in the Control Panel. Create a Tensorflow Model Training Docker Container, The contai
ner contains the following sections:Tensorflow environmentHDFS Client Will automatically download the algorithm file from HDFS and mount it into the container for distributed model training, mounting the algorithm file to the Work Dir path of the container.Submarine Tensorflow Docker Image Submarine provides you with an image file that supports Tensorflow (CPU and GPU versions), with the algorithm libraries commonly used by Python installed. You can also install other development dependencies you need on top of the base image provided by Submarine. Name Class Description %submarine SubmarineInterpreter Provides interpreter for Apache Submarine dashboard %submarine.sh SubmarineShellInterpreter Provides interpreter for Apache Submarine shell %submarine.python PySubmarineInterpreter Provides interpreter for Apache Submarine python Submarine shellAfter creating a Note with Submarine Interpreter in Zeppelin, you can add a paragraph to N
ote if you need it. Using the %submarine.sh identifier, you can use the Shell command to perform various operations on the Submarine Interpreter Docker Container, such as:View the Python version in the ContainerView the system environment of the ContainerInstall the dependencies you need yourselfKerberos authentication with kinitUse Hadoop in Container for HDFS operations, etc.Submarine pythonYou can add one or more paragraphs to Note. Write the algorithm module for Tensorflow in Python using the %submarine.python identifier.Submarine DashboardAfter writing the Tensorflow algorithm by using %submarine.python, you can add a paragraph to Note, enter %submarine dashboard and execute it. Zeppelin will create a Submarine Dashboard.With Submarine Dashboard you can do all the operational control of Submarine, for example:Usage: Display Submarine&#39;s command description to help developers locate problems.Refresh: Zeppelin will erase all your input in the Dashboard.Tensorboard: You will be redirected to the Tensorboard WEB system created by Submarine for each user. With Tensorboard you can view the status of the Tensorflow model training in real time.CommandJOB RUN: Selecting JOB RUN will display the parameter input interface for submitting a JOB. Name Description Checkpoint Path Submarine sets up a separate Checkpoint path for each user's Note for Tensorflow training. It saves the training data for this Note's history and the output of model training; Tensorboard uses the data in this path for model presentation. Users cannot modify it. For example: `hdfs://cluster1/...` , The environment variable name for Checkpoint Path is `%checkpoint_path%`, You can use `%checkpoint_path%` instead of the input value in Data Path in `PS Launch Cmd` and `Worker Launch Cmd`. Input Path The user specifies the data directory of the Tensorflow algorithm. Only HDFS-enabled directories are supported. The envir
onment variable name for Data Path is `%input_path%`, You can use `%input_path%` instead of the input value in Data Path in `PS Launch Cmd` and `Worker Launch Cmd`. PS Launch Cmd Tensorflow Parameter services launch command, e.g. `python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 ...` Worker Launch Cmd Tensorflow Worker services launch command, e.g. `python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=1 ...` JOB STOPYou can choose to execute the JOB STOP command to stop a Tensorflow model training task that has been submitted and is running.TENSORBOARD STARTYou can choose to execute the TENSORBOARD START command to create your TENSORBOARD Docker Container.TENSORBOARD STOPYou can choose to execute the TENSORBOARD STOP command to stop and destroy your TENSORBOARD Docker Container.Run Command: Execute the action command of your choice.Clean Checkpoint: Checking this option will clear the data in this Note&#39;s Checkpoint Path before each JOB RUN execution.ConfigurationZeppelin Submarine interpreter provides the following properties to customize the Submarine interpreter Attribute name Attribute value Description DOCKER_CONTAINER_TIME_ZONE Etc/UTC Set the time zone in the container | DOCKER_HADOOP_HDFS_HOME /hadoop-3.1-0 Hadoop path in the following 3 images (SUBMARINE_INTERPRETER_DOCKER_IMAGE, tf.parameter.services.docker.image, tf.worker.services.docker.image) | DOCKER_JAVA_HOME /opt/java JAVA path in the following 3 images (SUBMARINE_INTERPRETER_DOCKER_IMAGE, tf.parameter.services.docker.image, tf.worker.services.docker.image) | HADOOP_YARN_SUBMARINE_JAR Path to the Submarine JAR package in the Hadoop-3.1+ release installed on the Zeppelin server | INTERPRETER_LAUNCH_MODE local/yarn Run the S
ubmarine interpreter instance in local or YARN local mainly for submarine interpreter development and debugging YARN mode for production environment | SUBMARINE_HADOOP_CONF_DIR Set the HADOOP-CONF path to support multiple Hadoop cluster environments SUBMARINE_HADOOP_HOME Hadoop-3.1+ above path installed on the Zeppelin server SUBMARINE_HADOOP_KEYTAB Keytab file path for a hadoop cluster with kerberos authentication turned on SUBMARINE_HADOOP_PRINCIPAL PRINCIPAL information for the keytab file of the hadoop cluster with kerberos authentication turned on SUBMARINE_INTERPRETER_DOCKER_IMAGE At INTERPRETER_LAUNCH_MODE=yarn, Submarine uses this image to create a Zeppelin Submarine interpreter container to create an algorithm development environment for the user. | docker.container.network YARN's Docker network name machinelearing.distributed.enable Whether to use the model training of the
distributed mode JOB RUN submission shell.command.timeout.millisecs 60000 Execute timeout settings for shell commands in the Submarine interpreter container submarine.algorithm.hdfs.path Save machine-learning algorithms developed using the Submarine interpreter to HDFS as files submarine.yarn.queue root.default Submarine submits model training YARN queue name tf.checkpoint.path Tensorflow checkpoint path. Each user gets a second-level checkpoint path named after the username under this path, and each algorithm submitted by the user creates a third-level checkpoint path named after the note id (the user's Tensorboard uses the checkpoint data in this path for visual display) tf.parameter.services.cpu Number of CPU cores applied to Tensorflow parameter services when Submarine submits model distributed training tf.parameter.services.docker.image Docker image used by Submarine for Tensorflow parameter services when submitting model distributed training tf.parameter.services.gpu GPU cores applied to Tensorflow parameter services when Submarine submits model distributed training tf.parameter.services.memory 2G Memory resources requested by Tensorflow parameter services when Submarine submits model distributed training tf.parameter.services.num Number of Tensorflow parameter services used by Submarine to submit model distributed training tf.tensorboard.enable true Create a separate Tensorboard for each user tf.worker.services.cpu CPU resources for Tensorflow worker services when Submarine submits model training tf.worker.services.docker.image Docker image used by Submarine for Tensorflow worker services when submitting model distributed training tf.worker.services.gpu GPU resources for Tensorflow worker services when Submarine submits model training tf.worker.services.memory Memory resources for Tensorflow worker services when Submarine submits model training tf.worker.services.num Number of Tensorflow worker services used by Submarine to submit model distributed training yarn.webapp.http.address http://hadoop:8088 YARN web ui address zeppelin.interpreter.rpc.portRange 29914 You need to export this port in the SUBMARINE_INTERPRETER_DOCKER_IMAGE configuration image. RPC communication for Zeppelin Server and Submarine interpreter containers zeppelin.ipython.grpc.message_size 33554432 Message size setting for IPython grpc in the Submarine interpreter container zeppelin.ipython.launch.timeout 30000 IPython execution timeout setting in the Submarine interpreter container zeppelin.python python Execution path of python in the Submarine interpreter container zeppelin.python.maxResult 10000 The maximum number of python execution results returned from the Subma
rine interpreter container zeppelin.python.useIPython false IPython is currently not supported and must be false zeppelin.submarine.auth.type simple/kerberos Whether Hadoop has kerberos authentication turned on Docker imagesThe docker images file is stored in the zeppelin/scripts/docker/submarine directory.submarine interpreter cpu versionsubmarine interpreter gpu versiontensorflow 1.10 &amp; hadoop 3.1.2 cpu versiontensorflow 1.10 &amp; hadoop 3.1.2 gpu versionChange Log0.1.0 (Zeppelin 0.9.0) :Support distributed or standalone tensorflow model training.Support submarine interpreter running locally.Support submarine interpreter running on YARN.Support Docker on YARN-3.3.0, planned to be compatible with lower versions of yarn.Bugs &amp; ContactsSubmarine interpreter BUGIf you encounter a bug for this interpreter, please create a sub JIRA ticket on ZEPPELIN-3856.Submarine Running problemIf you encounter a problem for the Submarine runtime, please create an ISSUE on hadoop-submarine-ecosystem.YARN Submarine BUGIf you encounter a bug for Yarn Submarine, please create a JIRA ticket on SUBMARINE.DependencyYARNSubmarine currently needs to run on Hadoop 3.3+The hadoop version of the hadoop submarine team git repository is periodically merged into the hadoop code repository.The git repository of the hadoop submarine team is released faster than the hadoop release cycle, so you can use the hadoop version from the hadoop submarine team git repository.Submarine runtime environmentYou can use Submarine-installer (https://github.com/hadoopsubmarine) to deploy Docker and network environments.MoreHadoop Submarine Project: https://hadoop.apache.org/submarineYoutube Submarine Channel: https://www.youtube.com/channel/UC4JBt8Y8VJ0BW0IM9YpdCyQ",
+ "url": " /interpreter/submarine.html",
+ "group": "interpreter",
+ "excerpt": "Hadoop Submarine is the latest machine learning framework subproject in the Hadoop 3.1 release. It allows Hadoop to support Tensorflow, MXNet, Caffe, Spark, etc."
+ }
+ ,
+
+
+
"/interpreter/mahout.html": {
"title": "Mahout Interpreter for Apache Zeppelin",
"content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Apache Mahout Interpreter for Apache ZeppelinInstallationApache Mahout is a collection of packages that enable machine learning and matrix algebra on underlying engines such as Apache Flink or Apache Spark. A convenience script for creating and configuring two Mahout enabled interpreters exists. The %sparkMahout and %flinkMahout interpreters do not exist by default but can be easily created using this script. Easy InstallationTo
quickly and easily get up and running using Apache Mahout, run the following command from the top-level directory of the Zeppelin install:python scripts/mahout/add_mahout.pyThis will create the %sparkMahout and %flinkMahout interpreters, and restart Zeppelin.Advanced InstallationThe add_mahout.py script contains several command line arguments for advanced users. Argument Description Example --zeppelin_home This is the path to the Zeppelin installation. This flag is not needed if the script is run from the top-level installation directory or from the zeppelin/scripts/mahout directory. /path/to/zeppelin --mahout_home If the user has already installed Mahout, this flag can set the path to MAHOUT_HOME. If this is set, downloading Mahout will be skipped. /path/to/mahout_home --restart_later Restarting is necessary for updates to take effect. By default the script will restart Zeppelin for you. Restart will be skipped if this flag is set.
NA --force_download This flag will force the script to re-download the binary even if it already exists. This is useful for previously failed downloads. NA --overwrite_existing This flag will force the script to overwrite existing %sparkMahout and %flinkMahout interpreters. Useful when you want to just start over. NA NOTE 1: Apache Mahout at this time only supports Spark 1.5 and Spark 1.6 and Scala 2.10. If the user is using another version of Spark (e.g. 2.0), the %sparkMahout will likely not work. The %flinkMahout interpreter will still work and the user is encouraged to develop with that engine as the code can be ported via copy and paste, as is evidenced by the tutorial notebook.NOTE 2: If using Apache Flink in cluster mode, the following libraries will also need to be copied to ${FLINK_HOME}/lib- mahout-math-0.12.2.jar- mahout-math-scala2.10-0.12.2.jar- mahout-flink2.10-0.12.2.jar- mahout-hdfs-0.12.2.jar- com.google.guava:guava:14.0.1Ov
erviewThe Apache Mahout™ project&#39;s goal is to build an environment for quickly creating scalable, performant machine learning applications.Apache Mahout software provides three major features:A simple and extensible programming environment and framework for building scalable algorithmsA wide variety of premade algorithms for Scala + Apache Spark, H2O, Apache FlinkSamsara, a vector math experimentation environment with R-like syntax which works at scaleIn other words:Apache Mahout provides a unified API for quickly creating machine learning algorithms on a variety of engines.How to useWhen starting a session with Apache Mahout, depending on which engine you are using (Spark or Flink), a few imports must be made and a Distributed Context must be declared. Copy and paste the following code and run once to get started.Flink%flinkMahoutimport org.apache.flink.api.scala._import org.apache.mahout.math.drm._import org.apache.mahout.math.drm.RLikeDrmOps._import org.apache.mahout
.flinkbindings._import org.apache.mahout.math._import scalabindings._import RLikeOps._implicit val ctx = new FlinkDistributedContext(benv)Spark%sparkMahoutimport org.apache.mahout.math._import org.apache.mahout.math.scalabindings._import org.apache.mahout.math.drm._import org.apache.mahout.math.scalabindings.RLikeOps._import org.apache.mahout.math.drm.RLikeDrmOps._import org.apache.mahout.sparkbindings._implicit val sdc: org.apache.mahout.sparkbindings.SparkDistributedContext = sc2sdc(sc)Same Code, Different EnginesAfter importing and setting up the distributed context, the Mahout R-Like DSL is consistent across engines. The following code will run in both %flinkMahout and %sparkMahoutval drmData = drmParallelize(dense( (2, 2, 10.5, 10, 29.509541), // Apple Cinnamon Cheerios (1, 2, 12, 12, 18.042851), // Cap&#39;n&#39;Crunch (1, 1, 12, 13, 22.736446), // Cocoa Puffs (2, 1, 11, 13, 32.207582), // Froot Loops (1, 2, 12, 11, 21.871292), // Honey Graham Ohs (
2, 1, 16, 8, 36.187559), // Wheaties Honey Gold (6, 2, 17, 1, 50.764999), // Cheerios (3, 2, 13, 7, 40.400208), // Clusters (3, 3, 13, 4, 45.811716)), numPartitions = 2)drmData.collect(::, 0 until 4)val drmX = drmData(::, 0 until 4)val y = drmData.collect(::, 4)val drmXtX = drmX.t %*% drmXval drmXty = drmX.t %*% yval XtX = drmXtX.collectval Xty = drmXty.collect(::, 0)val beta = solve(XtX, Xty)Leveraging Resource Pools and R for VisualizationResource Pools are a powerful Zeppelin feature that lets us share information between interpreters. A fun trick is to take the output of our work in Mahout and analyze it in other languages.Setting up a Resource Pool in FlinkIn Spark based interpreters resource pools are accessed via the ZeppelinContext API. Putting and getting things from the resource pool can be done simply:val myVal = 1z.put(&quot;foo&quot;, myVal)val myFetchedVal = z.get(&quot;foo&quot;)To add this functionality to a Flink based interpreter we
declare the following%flinkMahoutimport org.apache.zeppelin.interpreter.InterpreterContextval z = InterpreterContext.get().getResourcePool()Now we can access the resource pool in a consistent manner from the %flinkMahout interpreter.Passing a variable from Mahout to R and PlottingIn this simple example, we use Mahout (on Flink or Spark, the code is the same) to create a random matrix and then take the sine of each element. We then randomly sample the matrix and create a tab separated string. Finally we pass that string to R where it is read as a .tsv file, and a DataFrame is created and plotted using native R plotting libraries.val mxRnd = Matrices.symmetricUniformView(5000, 2, 1234)val drmRand = drmParallelize(mxRnd)val drmSin = drmRand.mapBlock() {case (keys, block) =&gt; val blockB = block.like() for (i &lt;- 0 until block.nrow) { blockB(i, 0) = block(i, 0) blockB(i, 1) = Math.sin((block(i, 0) * 8)) } keys -&gt; blockB}z.put(&quot;sinDrm&quot;, org
.apache.mahout.math.drm.drmSampleToTSV(drmSin, 0.85))And then in an R paragraph...%spark.r {&quot;imageWidth&quot;: &quot;400px&quot;}library(&quot;ggplot2&quot;)sinStr = z.get(&quot;sinDrm&quot;)data &lt;- read.table(text= sinStr, sep=&quot;\t&quot;, header=FALSE)plot(data, col=&quot;red&quot;)",
@@ -47,7 +58,7 @@
"/interpreter/spark.html": {
"title": "Apache Spark Interpreter for Apache Zeppelin",
- "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Spark Interpreter for Apache ZeppelinOverviewApache Spark is a fast and general-purpose cluster computing system.It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.Apache Spark is supported in Zeppelin with Spark interpreter group which consists of below five interpreters. Name Class Description %spark SparkInterpreter Creates a SparkConte
xt and provides a Scala environment %spark.pyspark PySparkInterpreter Provides a Python environment %spark.r SparkRInterpreter Provides an R environment with SparkR support %spark.sql SparkSQLInterpreter Provides a SQL environment %spark.dep DepInterpreter Dependency loader ConfigurationThe Spark interpreter can be configured with properties provided by Zeppelin.You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties. Property Default Description args Spark commandline args master local[*] Spark master uri. ex) spark://masterhost:7077 spark.app.name Zeppelin The name of spark application. spark.cores.max Total number of cores to use. Empty value uses all available core. spark.executor.memory 1g Executor memory per worker instance. ex) 512m, 32g zeppelin.dep
.additionalRemoteRepository spark-packages, http://dl.bintray.com/spark-packages/maven, false; A list of id,remote-repository-URL,is-snapshot; for each remote repository. zeppelin.dep.localrepo local-repo Local repository for dependency loader PYSPARKPYTHON python Python binary executable to use for PySpark in both driver and workers (default is python). Property spark.pyspark.python take precedence if it is set PYSPARKDRIVERPYTHON python Python binary executable to use for PySpark in driver only (default is PYSPARKPYTHON). Property spark.pyspark.driver.python take precedence if it is set zeppelin.spark.concurrentSQL false Execute multiple SQL concurrently if set true. zeppelin.spark.concurrentSQL.max 10 Max number of SQL concurrently executed zeppelin.spark.maxResult 1000 Max number of Spark SQL result to display. zeppelin.spark.printREPLOutput true Print REPL o
utput zeppelin.spark.useHiveContext true Use HiveContext instead of SQLContext if it is true. zeppelin.spark.importImplicit true Import implicits, UDF collection, and sql if set true. zeppelin.spark.enableSupportedVersionCheck true Do not change - developer only setting, not for production use zeppelin.spark.sql.interpolation false Enable ZeppelinContext variable interpolation into paragraph text zeppelin.spark.uiWebUrl Overrides Spark UI default URL. Value should be a full URL (ex: http://{hostName}/{uniquePath} zeppelin.spark.scala.color true Whether to enable color output of spark scala interpreter Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you&#39;ll need to follow below two simple steps.1. Export SPARK_HOMEIn conf/zeppelin-env.sh, export SPARK_HOME environment variable with your Spark installation path.For example,expo
rt SPARK_HOME=/usr/lib/sparkYou can optionally set more environment variables# set hadoop conf direxport HADOOP_CONF_DIR=/usr/lib/hadoop# set options to pass spark-submit commandexport SPARK_SUBMIT_OPTIONS=&quot;--packages com.databricks:spark-csv_2.10:1.2.0&quot;# extra classpath. e.g. set classpath for hive-site.xmlexport ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/confFor Windows, ensure you have winutils.exe in %HADOOP_HOME%bin. Please see Problems running Hadoop on Windows for the details.2. Set master in Interpreter menuAfter start Zeppelin, go to Interpreter menu and edit master property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.For example,local[*] in local modespark://master:7077 in standalone clusteryarn-client in Yarn client modeyarn-cluster in Yarn cluster modemesos://host:5050 in Mesos clusterThat&#39;s it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppe
lin in this way.For the further information about Spark &amp; Zeppelin version compatibility, please refer to &quot;Available Interpreters&quot; section in Zeppelin download page.Note that without exporting SPARK_HOME, it&#39;s running in local mode with included version of Spark. The included version may vary depending on the build profile.3. Yarn modeZeppelin support both yarn client and yarn cluster mode (yarn cluster mode is supported from 0.8.0). For yarn mode, you must specify SPARK_HOME &amp; HADOOP_CONF_DIR.You can either specify them in zeppelin-env.sh, or in interpreter setting page. Specifying them in zeppelin-env.sh means you can use only one version of spark &amp; hadoop. Specifying themin interpreter setting page means you can use multiple versions of spark &amp; hadoop in one zeppelin instance.4. New Version of SparkInterpreterThere&#39;s one new version of SparkInterpreter with better spark support and code completion starting from Zep
pelin 0.8.0. We enable it by default, but user can still use the old version of SparkInterpreter by setting zeppelin.spark.useNew as false in its interpreter setting.SparkContext, SQLContext, SparkSession, ZeppelinContextSparkContext, SQLContext and ZeppelinContext are automatically created and exposed as variable names sc, sqlContext and z, respectively, in Scala, Python and R environments.Staring from 0.6.1 SparkSession is available as variable spark when you are using Spark 2.x.Note that Scala/Python/R environment shares the same SparkContext, SQLContext and ZeppelinContext instance. How to pass property to SparkConfThere&#39;re 2 kinds of properties that would be passed to SparkConfStandard spark property (prefix with spark.). e.g. spark.executor.memory will be passed to SparkConfNon-standard spark property (prefix with zeppelin.spark.). e.g. zeppelin.spark.property_1, property_1 will be passed to SparkConfDependency ManagementThere are two ways to load external libraries i
n Spark interpreter. First is using interpreter setting menu and second is loading Spark properties.1. Setting Dependencies via Interpreter SettingPlease see Dependency Management for the details.2. Loading Spark PropertiesOnce SPARK_HOME is set in conf/zeppelin-env.sh, Zeppelin uses spark-submit as spark interpreter runner. spark-submit supports two ways to load configurations.The first is command line options such as --master and Zeppelin can pass these options to spark-submit by exporting SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh. Second is reading configuration options from SPARK_HOME/conf/spark-defaults.conf. Spark properties that user can set to distribute libraries are: spark-defaults.conf SPARK_SUBMIT_OPTIONS Description spark.jars --jars Comma-separated list of local jars to include on the driver and executor classpaths. spark.jars.packages --packages Comma-separated list of maven coordinates of jars to include on the driver and execu
tor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version. spark.files --files Comma-separated list of files to be placed in the working directory of each executor. Here are few examples:SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.shexport SPARK_SUBMIT_OPTIONS=&quot;--packages com.databricks:spark-csv_2.10:1.2.0 --jars /path/mylib1.jar,/path/mylib2.jar --files /path/mylib1.py,/path/mylib2.zip,/path/mylib3.egg&quot;SPARK_HOME/conf/spark-defaults.confspark.jars /path/mylib1.jar,/path/mylib2.jarspark.jars.packages com.databricks:spark-csv_2.10:1.2.0spark.files /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip3. Dynamic Dependency Loading via %spark.dep interpreterNote: %spark.dep interpreter loads libraries to %spark and %spark.pyspark but not to %spark.sql interpreter. So we recommend you to use the first opt
ion instead.When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs using %spark.dep interpreter.Load libraries recursively from maven repositoryLoad libraries from local filesystemAdd additional maven repositoryAutomatically add libraries to SparkCluster (You can turn off)Dep interpreter leverages Scala environment. So you can write any Scala code here.Note that %spark.dep interpreter should be used before %spark, %spark.pyspark, %spark.sql.Here&#39;s usages.%spark.depz.reset() // clean up previously added artifact and repository// add maven repositoryz.addRepo(&quot;RepoName&quot;).url(&quot;RepoURL&quot;)// add maven snapshot repositoryz.addRepo(&quot;RepoName&quot;).url(&quot;RepoURL&quot;).snapshot()// add credentials for private maven repositoryz.addRepo(&quot;RepoName&quot;).url(&quot;RepoURL&quot;).username(&quot;username&quot;).password(&quot;p
assword&quot;)// add artifact from filesystemz.load(&quot;/path/to.jar&quot;)// add artifact from maven repository, with no dependencyz.load(&quot;groupId:artifactId:version&quot;).excludeAll()// add artifact recursivelyz.load(&quot;groupId:artifactId:version&quot;)// add artifact recursively except comma separated GroupID:ArtifactId listz.load(&quot;groupId:artifactId:version&quot;).exclude(&quot;groupId:artifactId,groupId:artifactId, ...&quot;)// exclude with patternz.load(&quot;groupId:artifactId:version&quot;).exclude(*)z.load(&quot;groupId:artifactId:version&quot;).exclude(&quot;groupId:artifactId:*&quot;)z.load(&quot;groupId:artifactId:version&quot;).exclude(&quot;groupId:*&quot;)// local() skips adding artifact to spark clusters (skipping sc.addJar())z.load(&quot;groupId:artifactId:version&quot;).local()ZeppelinContextZeppelin automatically injects ZeppelinContext as variable z in your
Scala/Python environment. ZeppelinContext provides some additional functions and utilities.See Zeppelin-Context for more details.Matplotlib Integration (pyspark)Both the python and pyspark interpreters have built-in support for inline visualization using matplotlib,a popular plotting library for python. More details can be found in the python interpreter documentation,since matplotlib support is identical. More advanced interactive plotting can be done with pyspark throughutilizing Zeppelin&#39;s built-in Angular Display System, as shown below:Running spark sql concurrentlyBy default, each sql statement would run sequentially in %spark.sql. But you can run them concurrently by following setup.set zeppelin.spark.concurrentSQL to true to enable the sql concurrent feature, underneath zeppelin will change to use fairscheduler for spark. And also set zeppelin.spark.concurrentSQL.max to control the max number of sql statements running concurrently.configure pools by creating fairsche
duler.xml under your SPARK_CONF_DIR, check the offical spark doc Configuring Pool Propertiesset pool property via setting paragraph property. e.g.%spark(pool=pool1)sql statementThis feature is available for both all versions of scala spark, pyspark. For sparkr, it is only available starting from 2.3.0.Interpreter setting optionYou can choose one of shared, scoped and isolated options wheh you configure Spark interpreter.Spark interpreter creates separated Scala compiler per each notebook but share a single SparkContext in scoped mode (experimental).It creates separated SparkContext per each notebook in isolated mode.IPython supportBy default, zeppelin would use IPython in pyspark when IPython is available, Otherwise it would fall back to the original PySpark implementation.If you don&#39;t want to use IPython, then you can set zeppelin.pyspark.useIPython as false in interpreter setting. For the IPython features, you can refer docPython InterpreterSetting up Zeppelin with Kerbero
sLogical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:Configuration SetupOn the server that Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.This is to make the server communicate with KDC.Set SPARK_HOME in [ZEPPELIN_HOME]/conf/zeppelin-env.sh to use spark-submit(Additionally, you might have to set export HADOOP_CONF_DIR=/etc/hadoop/conf)Add the two properties below to Spark configuration ([SPARK_HOME]/conf/spark-defaults.conf):spark.yarn.principalspark.yarn.keytabNOTE: If you do not have permission to access for the above spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.That&#39;s it. Play with Zeppelin!",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Spark Interpreter for Apache ZeppelinOverviewApache Spark is a fast and general-purpose cluster computing system.It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.Apache Spark is supported in Zeppelin with the Spark interpreter group, which consists of the five interpreters below. Name Class Description %spark SparkInterpreter Creates a SparkConte
xt and provides a Scala environment %spark.pyspark PySparkInterpreter Provides a Python environment %spark.r SparkRInterpreter Provides an R environment with SparkR support %spark.sql SparkSQLInterpreter Provides a SQL environment %spark.dep DepInterpreter Dependency loader ConfigurationThe Spark interpreter can be configured with properties provided by Zeppelin.You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties. Property Default Description args Spark commandline args master local[*] Spark master uri. ex) spark://masterhost:7077 spark.app.name Zeppelin The name of spark application. spark.cores.max Total number of cores to use. Empty value uses all available cores. spark.executor.memory 1g Executor memory per worker instance. ex) 512m, 32g zeppelin.dep
.additionalRemoteRepository spark-packages, http://dl.bintray.com/spark-packages/maven, false; A list of id,remote-repository-URL,is-snapshot; for each remote repository. zeppelin.dep.localrepo local-repo Local repository for dependency loader PYSPARK_PYTHON python Python binary executable to use for PySpark in both driver and workers (default is python). Property spark.pyspark.python takes precedence if it is set PYSPARK_DRIVER_PYTHON python Python binary executable to use for PySpark in driver only (default is PYSPARK_PYTHON). Property spark.pyspark.driver.python takes precedence if it is set zeppelin.spark.concurrentSQL false Execute multiple SQL concurrently if set true. zeppelin.spark.concurrentSQL.max 10 Max number of SQL concurrently executed zeppelin.spark.maxResult 1000 Max number of Spark SQL result to display. zeppelin.spark.printREPLOutput true Print RE
PL output zeppelin.spark.useHiveContext true Use HiveContext instead of SQLContext if it is true. zeppelin.spark.importImplicit true Import implicits, UDF collection, and sql if set true. zeppelin.spark.enableSupportedVersionCheck true Do not change - developer only setting, not for production use zeppelin.spark.sql.interpolation false Enable ZeppelinContext variable interpolation into paragraph text zeppelin.spark.uiWebUrl Overrides Spark UI default URL. Value should be a full URL (ex: http://{hostName}/{uniquePath}) zeppelin.spark.scala.color true Whether to enable color output of spark scala interpreter Without any configuration, Spark interpreter works out of the box in local mode. But if you want to connect to your Spark cluster, you&#39;ll need to follow the two simple steps below.1. Export SPARK_HOMEIn conf/zeppelin-env.sh, export SPARK_HOME environment variable with your Spark installation path.For example,
export SPARK_HOME=/usr/lib/sparkYou can optionally set more environment variables# set hadoop conf direxport HADOOP_CONF_DIR=/usr/lib/hadoop# set options to pass spark-submit commandexport SPARK_SUBMIT_OPTIONS=&quot;--packages com.databricks:spark-csv_2.10:1.2.0&quot;# extra classpath. e.g. set classpath for hive-site.xmlexport ZEPPELIN_INTP_CLASSPATH_OVERRIDES=/etc/hive/confFor Windows, ensure you have winutils.exe in %HADOOP_HOME%\bin. Please see Problems running Hadoop on Windows for the details.2. Set master in Interpreter menuAfter starting Zeppelin, go to the Interpreter menu and edit the master property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.For example,local[*] in local modespark://master:7077 in standalone clusteryarn-client in Yarn client modeyarn-cluster in Yarn cluster modemesos://host:5050 in Mesos clusterThat&#39;s it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Z
eppelin in this way.For further information about Spark &amp; Zeppelin version compatibility, please refer to the &quot;Available Interpreters&quot; section in the Zeppelin download page.Note that without exporting SPARK_HOME, it&#39;s running in local mode with the included version of Spark. The included version may vary depending on the build profile.3. Yarn modeZeppelin supports both yarn client and yarn cluster mode (yarn cluster mode is supported from 0.8.0). For yarn mode, you must specify SPARK_HOME &amp; HADOOP_CONF_DIR.You can either specify them in zeppelin-env.sh, or in the interpreter setting page. Specifying them in zeppelin-env.sh means you can use only one version of spark &amp; hadoop. Specifying them in the interpreter setting page means you can use multiple versions of spark &amp; hadoop in one zeppelin instance.4. New Version of SparkInterpreterStarting from 0.9, we totally removed the old spark interpreter implementation, and made the new spark interpre
ter the official spark interpreter.SparkContext, SQLContext, SparkSession, ZeppelinContextSparkContext, SQLContext and ZeppelinContext are automatically created and exposed as variable names sc, sqlContext and z, respectively, in Scala, Python and R environments.Starting from 0.6.1 SparkSession is available as variable spark when you are using Spark 2.x.Note that Scala/Python/R environments share the same SparkContext, SQLContext and ZeppelinContext instance. How to pass property to SparkConfThere&#39;re 2 kinds of properties that would be passed to SparkConfStandard spark property (prefix with spark.). e.g. spark.executor.memory will be passed to SparkConfNon-standard spark property (prefix with zeppelin.spark.). e.g. zeppelin.spark.property_1, property_1 will be passed to SparkConfDependency ManagementFor spark interpreter, you should not use Zeppelin&#39;s Dependency Management for managing third party dependencies (%spark.dep also is not the recommended approach star
ting from Zeppelin 0.8). Instead you should set spark properties (spark.jars, spark.files, spark.jars.packages) in 2 ways. spark-defaults.conf SPARK_SUBMIT_OPTIONS Description spark.jars --jars Comma-separated list of local jars to include on the driver and executor classpaths. spark.jars.packages --packages Comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version. spark.files --files Comma-separated list of files to be placed in the working directory of each executor. 1. Set spark properties on the zeppelin side. On the zeppelin side, you can either set them in the spark interpreter setting page or via Generic ConfInterpreter. It is not recommended to set them in SPARK_SUBMIT_OPTIONS. Because it will be shared by all spar
k interpreters, you cannot set different dependencies for different users.2. Set spark properties on the spark side. On the spark side, you can set them in spark-defaults.conf, e.g. spark.jars /path/mylib1.jar,/path/mylib2.jar spark.jars.packages com.databricks:spark-csv_2.10:1.2.0 spark.files /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zipZeppelinContextZeppelin automatically injects ZeppelinContext as variable z in your Scala/Python environment. ZeppelinContext provides some additional functions and utilities.See Zeppelin-Context for more details.Matplotlib Integration (pyspark)Both the python and pyspark interpreters have built-in support for inline visualization using matplotlib, a popular plotting library for python. More details can be found in the python interpreter documentation, since matplotlib support is identical. More advanced interactive plotting can be done with pyspark through utilizing Zeppelin&#39;s built-in Angular Display System, as shown below:
Running spark sql concurrentlyBy default, each sql statement would run sequentially in %spark.sql. But you can run them concurrently by the following setup.set zeppelin.spark.concurrentSQL to true to enable the sql concurrent feature, underneath zeppelin will change to use fairscheduler for spark. And also set zeppelin.spark.concurrentSQL.max to control the max number of sql statements running concurrently.configure pools by creating fairscheduler.xml under your SPARK_CONF_DIR, check the official spark doc Configuring Pool Propertiesset pool property via setting paragraph property. e.g.%spark(pool=pool1)sql statementThis feature is available for all versions of scala spark and pyspark. For sparkr, it is only available starting from 2.3.0.Interpreter setting optionYou can choose one of shared, scoped and isolated options when you configure Spark interpreter.Spark interpreter creates a separate Scala compiler per notebook but shares a single SparkContext in scoped mode (experimental).
It creates a separate SparkContext per notebook in isolated mode.IPython supportBy default, zeppelin would use IPython in pyspark when IPython is available; otherwise it would fall back to the original PySpark implementation.If you don&#39;t want to use IPython, then you can set zeppelin.pyspark.useIPython as false in interpreter setting. For the IPython features, you can refer to the doc Python InterpreterSetting up Zeppelin with KerberosLogical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:Deprecate Spark 2.2 and earlier versionsStarting from 0.9, Zeppelin deprecates Spark 2.2 and earlier versions. So you will see a warning message when you use Spark 2.2 and earlier.You can get rid of this message by setting zeppelin.spark.deprecatedMsg.show to false.Configuration SetupOn the server where Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.This is to make the server communicate with KDC.Set SPARK_HOME in [ZEPPELIN_HOME
]/conf/zeppelin-env.sh to use spark-submit(Additionally, you might have to set export HADOOP_CONF_DIR=/etc/hadoop/conf)Add the two properties below to Spark configuration ([SPARK_HOME]/conf/spark-defaults.conf):spark.yarn.principalspark.yarn.keytabNOTE: If you do not have permission to access the above spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.That&#39;s it. Play with Zeppelin!",
"url": " /interpreter/spark.html",
"group": "interpreter",
"excerpt": "Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution engine."
@@ -80,7 +91,7 @@
"/interpreter/ignite.html": {
"title": "Ignite Interpreter for Apache Zeppelin",
- "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Ignite Interpreter for Apache ZeppelinOverviewApache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.You can use Zeppelin to retrieve distributed data from cache using Ignite SQL interpreter. Moreover, Ignite interpreter allo
ws you to execute any Scala code in cases when SQL doesn&#39;t fit to your requirements. For example, you can populate data into your caches or execute distributed computations.Installing and Running Ignite exampleIn order to use Ignite interpreters, you may install Apache Ignite in some simple steps:Ignite provides examples only with source or binary release. Download Ignite source release or binary release whatever you want. But you must download Ignite as the same version of Zeppelin&#39;s. If it is not, you can&#39;t use scala code on Zeppelin. The supported Ignite version is specified in Supported Interpreter table for each Zeppelin release. If you&#39;re using Zeppelin master branch, please see ignite.version in path/to/your-Zeppelin/ignite/pom.xml.Examples are shipped as a separate Maven project, so to start running you simply need to import provided &lt;dest_dir&gt;/apache-ignite-fabric-{version}-bin/examples/pom.xml file into your favourite IDE, such
as Eclipse.In case of Eclipse, Eclipse -&gt; File -&gt; Import -&gt; Existing Maven ProjectsSet examples directory path to Eclipse and select the pom.xml.Then start org.apache.ignite.examples.ExampleNodeStartup (or whatever you want) to run at least one or more ignite node. When you run example code, you may notice that the number of node is increase one by one.Tip. If you want to run Ignite examples on the cli not IDE, you can export executable Jar file from IDE. Then run it by using below command.nohup java -jar &lt;/path/to/your Jar file name&gt;Configuring Ignite InterpreterAt the &quot;Interpreters&quot; menu, you may edit Ignite interpreter or create new one. Zeppelin provides these properties for Ignite. Property Name value Description ignite.addresses 127.0.0.1:47500..47509 Coma separated list of Ignite cluster hosts. See [Ignite Cluster Configuration](https://apacheignite.readme.io/docs/cluster-config) section for more de
tails. ignite.clientMode true You can connect to the Ignite cluster as client or server node. See [Ignite Clients vs. Servers](https://apacheignite.readme.io/docs/clients-vs-servers) section for details. Use true or false values in order to connect in client or server mode respectively. ignite.config.url Configuration URL. Overrides all other settings. ignite.jdbc.url jdbc:ignite:cfg://default-ignite-jdbc.xml Ignite JDBC connection URL. ignite.peerClassLoadingEnabled true Enables peer-class-loading. See [Zero Deployment](https://apacheignite.readme.io/docs/zero-deployment) section for details. Use true or false values in order to enable or disable P2P class loading respectively. How to useAfter configuring Ignite interpreter, create your own notebook. Then you can bind interpreters like below image.For more interpreter binding information see here.Ignite SQL interpreterIn order to execute SQL query, use %ignite.ignitesql prefix.
Supposing you are running org.apache.ignite.examples.streaming.wordcount.StreamWords, then you can use &quot;words&quot; cache( Of course you have to specify this cache name to the Ignite interpreter setting section ignite.jdbc.url of Zeppelin ).For example, you can select top 10 words in the words cache using the following query%ignite.ignitesqlselect _val, count(_val) as cnt from String group by _val order by cnt desc limit 10As long as your Ignite version and Zeppelin Ignite version is same, you can also use scala code. Please check the Zeppelin Ignite version before you download your own Ignite.%igniteimport org.apache.ignite._import org.apache.ignite.cache.affinity._import org.apache.ignite.cache.query._import org.apache.ignite.configuration._import scala.collection.JavaConversions._val cache: IgniteCache[AffinityUuid, String] = ignite.cache(&quot;words&quot;)val qry = new SqlFieldsQuery(&quot;select avg(cnt), min(cnt), max(cnt) from (select count(_val) as c
nt from String group by _val)&quot;, true)val res = cache.query(qry).getAll()collectionAsScalaIterable(res).foreach(println _)Apache Ignite also provides a guide docs for Zeppelin &quot;Ignite with Apache Zeppelin&quot;",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Ignite Interpreter for Apache ZeppelinOverviewApache Ignite In-Memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies.You can use Zeppelin to retrieve distributed data from cache using Ignite SQL interpreter. Moreover, Ignite interpreter allo
ws you to execute any Scala code in cases when SQL doesn&#39;t fit your requirements. For example, you can populate data into your caches or execute distributed computations.Installing and Running Ignite exampleIn order to use the Ignite interpreters, you can install Apache Ignite in a few simple steps:Ignite provides examples only with the source or binary release. Download either the Ignite source release or the binary release, but you must download the same Ignite version that Zeppelin supports; otherwise, you can&#39;t use scala code on Zeppelin. The supported Ignite version is specified in the Supported Interpreter table for each Zeppelin release. If you&#39;re using the Zeppelin master branch, please see ignite.version in path/to/your-Zeppelin/ignite/pom.xml.Examples are shipped as a separate Maven project, so to start running you simply need to import the provided &lt;dest_dir&gt;/apache-ignite-fabric-{version}-bin/examples/pom.xml file into your favourite IDE, such
 as Eclipse.In case of Eclipse, Eclipse -&gt; File -&gt; Import -&gt; Existing Maven ProjectsSet the examples directory path in Eclipse and select the pom.xml.Then start org.apache.ignite.examples.ExampleNodeStartup (or whatever you want) to run at least one ignite node. When you run the example code, you may notice that the number of nodes increases one by one.Tip. If you want to run Ignite examples on the CLI rather than in an IDE, you can export an executable Jar file from the IDE, then run it using the command below.nohup java -jar &lt;/path/to/your Jar file name&gt;Configuring Ignite InterpreterAt the &quot;Interpreters&quot; menu, you may edit the Ignite interpreter or create a new one. Zeppelin provides these properties for Ignite. Property Name value Description ignite.addresses 127.0.0.1:47500..47509 Comma-separated list of Ignite cluster hosts. See Ignite Cluster Configuration section for more details. ignite.clientMode true You can con
nect to the Ignite cluster as a client or a server node. See the Ignite Clients vs. Servers section for details. Use true or false to connect in client or server mode respectively. ignite.config.url Configuration URL. Overrides all other settings. ignite.jdbc.url jdbc:ignite:cfg://default-ignite-jdbc.xml Ignite JDBC connection URL. ignite.peerClassLoadingEnabled true Enables peer-class-loading. See the Zero Deployment section for details. Use true or false to enable or disable P2P class loading respectively. How to useAfter configuring the Ignite interpreter, create your own notebook. Then you can bind interpreters as in the image below.For more interpreter binding information see here.Ignite SQL interpreterIn order to execute a SQL query, use the %ignite.ignitesql prefix. Supposing you are running org.apache.ignite.examples.streaming.wordcount.StreamWords, you can use the &quot;words&quot; cache (of course, you have to specify t
his cache name in the Ignite interpreter setting section ignite.jdbc.url of Zeppelin ).For example, you can select the top 10 words in the words cache using the following query%ignite.ignitesqlselect _val, count(_val) as cnt from String group by _val order by cnt desc limit 10As long as your Ignite version and your Zeppelin Ignite version are the same, you can also use scala code. Please check the Zeppelin Ignite version before you download your own Ignite.%igniteimport org.apache.ignite._import org.apache.ignite.cache.affinity._import org.apache.ignite.cache.query._import org.apache.ignite.configuration._import scala.collection.JavaConversions._val cache: IgniteCache[AffinityUuid, String] = ignite.cache(&quot;words&quot;)val qry = new SqlFieldsQuery(&quot;select avg(cnt), min(cnt), max(cnt) from (select count(_val) as cnt from String group by _val)&quot;, true)val res = cache.query(qry).getAll()collectionAsScalaIterable(res).foreach(println _)Apache Ignite also provides a guide doc for Zeppelin, &quot;Ignite with Apache Zeppelin&quot;",
"url": " /interpreter/ignite.html",
"group": "interpreter",
"excerpt": "Apache Ignite in-memory Data Fabric is a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies."
@@ -201,7 +212,7 @@
"/interpreter/flink.html": {
"title": "Flink Interpreter for Apache Zeppelin",
- "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Flink interpreter for Apache ZeppelinOverviewApache Flink is an open source platform for distributed stream and batch data processing. Flinkâs core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program opt
imization.How to start local Flink cluster, to test the interpreterZeppelin comes with pre-configured flink-local interpreter, which starts Flink in a local mode on your machine, so you do not need to install anything.How to configure interpreter to point to Flink clusterAt the &quot;Interpreters&quot; menu, you have to create a new Flink interpreter and provide next properties: property value Description host local host name of running JobManager. 'local' runs flink in local mode (default) port 6123 port of running JobManager For more information about Flink configuration, you can find it here.How to test it&#39;s workingYou can find an example of Flink usage in the Zeppelin Tutorial folder or try the following word count example, by using the Zeppelin notebook from Till Rohrmann&#39;s presentation Interactive data analysis with Apache Flink for Apache Flink Meetup.%shrm 10.txt.utf-8wget http://www.gutenberg.org/ebooks/1
0.txt.utf-8%flinkcase class WordCount(word: String, frequency: Int)val bible:DataSet[String] = benv.readTextFile(&quot;10.txt.utf-8&quot;)val partialCounts: DataSet[WordCount] = bible.flatMap{ line =&gt; &quot;&quot;&quot;bw+b&quot;&quot;&quot;.r.findAllIn(line).map(word =&gt; WordCount(word, 1))// line.split(&quot; &quot;).map(word =&gt; WordCount(word, 1))}val wordCounts = partialCounts.groupBy(&quot;word&quot;).reduce{ (left, right) =&gt; WordCount(left.word, left.frequency + right.frequency)}val result10 = wordCounts.first(10).collect()",
+ "content" : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Flink interpreter for Apache ZeppelinOverviewApache Flink is an open source platform for distributed stream and batch data processing. Flink&#39;s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink also builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program opt
imization.Apache Flink is supported in Zeppelin with the Flink interpreter group, which consists of the five interpreters below. Name Class Description %flink FlinkInterpreter Creates ExecutionEnvironment/StreamExecutionEnvironment/BatchTableEnvironment/StreamTableEnvironment and provides a Scala environment %flink.pyflink PyFlinkInterpreter Provides a python environment %flink.ipyflink IPyFlinkInterpreter Provides an ipython environment %flink.ssql FlinkStreamSqlInterpreter Provides a stream sql environment %flink.bsql FlinkBatchSqlInterpreter Provides a batch sql environment ConfigurationThe Flink interpreter can be configured with properties provided by Zeppelin.You can also set other flink properties which are not listed in the table. For a list of additional properties, refer to Flink Available Properties. Property Default Description FLINK_HOME Location of the flink installation. It mus
t be specified; otherwise you cannot use flink in zeppelin flink.execution.mode local Execution mode of flink, e.g. local/yarn/remote flink.execution.remote.host jobmanager hostname when in remote mode flink.execution.remote.port jobmanager port when in remote mode flink.jm.memory 1024 Total memory (mb) of the JobManager flink.tm.memory 1024 Total memory (mb) of each TaskManager flink.tm.num 2 Number of TaskManagers flink.tm.slot 1 Number of slots per TaskManager flink.yarn.appName Zeppelin Flink Session Yarn app name flink.yarn.queue queue name of the yarn app flink.yarn.jars additional user jars (comma separated) zeppelin.flink.scala.color true whether to display scala shell output in colorful format zeppelin.flink.enableHive false whether to enable hive zeppelin.flink.printREPLOutput true Print REPL output
 zeppelin.flink.maxResult 1000 max number of rows returned by the sql interpreter zeppelin.flink.planner blink planner for the flink table api, blink or flink zeppelin.pyflink.python python python executable for pyflink StreamExecutionEnvironment, ExecutionEnvironment, StreamTableEnvironment, BatchTableEnvironmentZeppelin will create 4 variables to represent flink&#39;s entrypoints:* senv (StreamExecutionEnvironment)* env (ExecutionEnvironment)* stenv (StreamTableEnvironment)* btenv (BatchTableEnvironment)ZeppelinContextZeppelin automatically injects ZeppelinContext as the variable z in your Scala/Python environment. ZeppelinContext provides some additional functions and utilities.See Zeppelin-Context for more details.IPython supportBy default, zeppelin uses IPython in pyflink when IPython is available; otherwise it falls back to the original PyFlink implementation.If you don&#39;t want to use IPython, you can set zeppelin.pyflink.useIPython to false in the interpreter setting. For the IPython features, you can refer to the doc Python Interpreter",
"url": " /interpreter/flink.html",
"group": "interpreter",
"excerpt": "Apache Flink is an open source platform for distributed stream and batch data processing."
@@ -223,7 +234,7 @@
"/interpreter/cassandra.html": {
"title": "Cassandra CQL Interpreter for Apache Zeppelin",
[... 159 lines stripped ...]