Posted to users@zeppelin.apache.org by York Huang <yo...@gmail.com> on 2016/09/03 08:45:11 UTC
Zeppelin architecture
Hi,
I am new to Zeppelin and have a few questions.
1. Should I install Zeppelin on a Hadoop edge node and have every user access it from a browser? Or should every user install their own Zeppelin?
2. How do I run standard Python without using Spark?
3. Can I install Zeppelin on a Windows server?
4. Is it possible to share data between interpreters?
Thanks
York
Re: Zeppelin architecture
Posted by York Huang <yo...@gmail.com>.
Hi Moon,
Sorry, a few more questions.
My cluster is a MapR cluster.
If I install Zeppelin on one edge node and multiple users access that
instance, how do I set things up so that each user runs jobs and accesses
data in the MapR cluster under their own account?
If I instead install Zeppelin on every user's desktop so that they access
MapR from their own machines, how do I install Zeppelin on their Windows
desktops?
Is there a guide somewhere?
Thanks,
York
On 7 September 2016 at 10:06, York Huang <yo...@gmail.com> wrote:
> Hi Moon,
>
> More questions.
>
> If I set up the MapR cluster in secure mode, how do I set up zeppelin?
>
> Thanks,
>
> York
Re: Zeppelin architecture
Posted by York Huang <yo...@gmail.com>.
Hi Moon,
More questions.
If I set up the MapR cluster in secure mode, how do I set up zeppelin?
Thanks,
York
On 6 September 2016 at 17:16, York Huang <yo...@gmail.com> wrote:
> Hi Moon,
>
> Thanks for your response.
>
> I have a MapR 4.1 cluster and would like to use Zeppelin on it. If I
> install Zeppelin on an edge node, what security should I set up? The online
> documentation is a bit confusing. Basically, I want every user to have
> their own account (either AD or a newly created Zeppelin account).
>
> Is there any guide?
>
> Thanks,
>
> York
Re: Running R on Zeppelin EMR Cluster
Posted by Flayranalytics <ma...@flayranalytics.co.uk>.
Thanks for sharing.
It is disappointing that R is not available on EMR. I will look out for updates.
Regards,
Mark
> On 6 Sep 2016, at 17:42, Jonathan Kelly <jo...@gmail.com> wrote:
>
> Mark,
>
> I see in the couchbase-spark-connector GitHub project that they have already upgraded to Spark 2.0 (https://github.com/couchbase/couchbase-spark-connector/pull/9) but that this change has not yet been released in a new version. According to the discussion on that pull request, it sounds like they are hoping for a new version this month.
>
> As for using the R interpreter on emr-5.0.0, unfortunately EMR does not yet (officially) support the R interpreter. I expect that we (I'm from EMR, btw) would be able to support it eventually, but I'm unable to give any ETA on that.
>
> ~ Jonathan
Re: Running R on Zeppelin EMR Cluster
Posted by Jonathan Kelly <jo...@gmail.com>.
Mark,
I see in the couchbase-spark-connector GitHub project that they have
already upgraded to Spark 2.0 (
https://github.com/couchbase/couchbase-spark-connector/pull/9) but that
this change has not yet been released into a new version. According to the
discussion on that pull request, it sounds like they are hoping for a new
version this month.
As for using the R interpreter on emr-5.0.0, unfortunately EMR does not yet
(officially) support the R interpreter. I expect that we (I'm from EMR,
btw) would be able to support it eventually, but I'm unable to give any ETA
on that.
~ Jonathan
On Tue, Sep 6, 2016 at 8:34 AM Mark Mikolajczak - 07855 306 064 <
mark@flayranalytics.co.uk> wrote:
> Thanks I was afraid that was the solution.
>
> I am connecting to a Couchbase database and the connector only supports
> Spark 1.6 and by upgrading I will be using EMR-5.0.0 which seems to only
> run Spark 2.0…
Re: Running R on Zeppelin EMR Cluster
Posted by Mark Mikolajczak - 07855 306 064 <ma...@flayranalytics.co.uk>.
Thanks, I was afraid that was the solution.
I am connecting to a Couchbase database, and the connector only supports Spark 1.6. By upgrading I would be on EMR-5.0.0, which seems to run only Spark 2.0…
> On 6 Sep 2016, at 16:27, Hyung Sung Shim <hs...@nflabs.com> wrote:
>
> and EMR-5.0.0 supports Zeppelin 0.6.1.
Re: Running R on Zeppelin EMR Cluster
Posted by Hyung Sung Shim <hs...@nflabs.com>.
and EMR-5.0.0 supports Zeppelin 0.6.1.
2016-09-07 0:24 GMT+09:00 Hyung Sung Shim <hs...@nflabs.com>:
> Hi.
> Unfortunately, Zeppelin 0.5.6 does not support the R interpreter.
> Could you upgrade Zeppelin to a newer version?
Re: Running R on Zeppelin EMR Cluster
Posted by Hyung Sung Shim <hs...@nflabs.com>.
Hi.
Unfortunately, Zeppelin 0.5.6 does not support the R interpreter.
Could you upgrade Zeppelin to a newer version?
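The version point above can be sketched as a small check. This is only an illustration: the 0.6.0 threshold is an assumption based on the releases mentioned in these threads, and the function names are hypothetical. A passing check says nothing about whether a given distribution (EMR, for example) actually bundles the interpreter.

```python
# Assumption: the R interpreter is treated as available from Zeppelin 0.6.0 onward.
R_INTERPRETER_MIN_VERSION = (0, 6, 0)


def parse_version(text):
    """Turn a dotted version string such as '0.5.6' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_r_interpreter(zeppelin_version):
    """Return True if the given Zeppelin version meets the assumed minimum."""
    return parse_version(zeppelin_version) >= R_INTERPRETER_MIN_VERSION


# The two Zeppelin versions mentioned in this thread.
for version in ("0.5.6", "0.6.1"):
    print(version, supports_r_interpreter(version))
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.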
2016-09-06 23:53 GMT+09:00 Mark Mikolajczak - 07855 306 064 <
mark@flayranalytics.co.uk>:
> Hi All,
>
> I am trying to setup the R interpreter to run in Zeppelin which is
> currently running on EMR. Zeppelin is working perfectly and I am able to
> write script in Scala and Python. When I use %r, %sparkR or %knitr I
> receive an error : "r interpreter not found"
>
> The applications which I have running in my emr-4.7.2 cluster are: Hive
> 1.0.0, Zeppelin-Sandbox 0.5.6, Spark 1.6.2, Pig 0.14.0
>
> Within the interpreter there is no mention of R so figure I am missing
> something but do not know what.
>
> Any pointers greatly appreciated.
>
Running R on Zeppelin EMR Cluster
Posted by Mark Mikolajczak - 07855 306 064 <ma...@flayranalytics.co.uk>.
Hi All,
I am trying to set up the R interpreter in Zeppelin, which is currently running on EMR. Zeppelin is working perfectly and I am able to write scripts in Scala and Python. When I use %r, %sparkR or %knitr I receive an error: "r interpreter not found"
The applications which I have running in my emr-4.7.2 cluster are: Hive 1.0.0, Zeppelin-Sandbox 0.5.6, Spark 1.6.2, Pig 0.14.0
There is no mention of R within the interpreter, so I figure I am missing something, but I do not know what.
Any pointers greatly appreciated.
Re: Zeppelin architecture
Posted by York Huang <yo...@gmail.com>.
Hi Moon,
Thanks for your response.
I have a MapR 4.1 cluster and would like to use Zeppelin on it. If I
install Zeppelin on an edge node, what security should I set up? The online
documentation is a bit confusing. Basically, I want every user to have
their own account (either AD or a newly created Zeppelin account).
Is there any guide?
Thanks,
York
On 5 September 2016 at 07:31, moon soo Lee <mo...@apache.org> wrote:
> Hi York,
>
> Thanks for the question.
>
> 1. How you install Zeppelin is up to you and your use case. You can either
> run a single instance of Zeppelin, configure authentication, and let many
> users log in, or let each user run their own Zeppelin instance.
> I see both setups among users, and it really depends on your environment.
>
> 2. Since the 0.6.0 release, Zeppelin ships a Python interpreter. You can try
> %python.
>
> 3. You can run Zeppelin on Windows by running bin/zeppelin.cmd.
>
> 4. Interpreters can share data through the resource pool. You can think of
> the resource pool as a distributed map shared across all interpreters.
> Although every interpreter can access the resource pool, only a few
> interpreters expose an API that lets users access it directly.
>
> SparkInterpreter, PysparkInterpreter and SparkRInterpreter are interpreters
> that expose the resource pool API to users. You can access the resource pool
> via the z.get() and z.put() API. See [1].
>
>
> Thanks,
> moon
>
> [1] http://zeppelin.apache.org/docs/latest/interpreter/spark.html#object-exchange
Re: Zeppelin architecture
Posted by moon soo Lee <mo...@apache.org>.
Hi York,
Thanks for the question.
1. How you install Zeppelin is up to you and your use case. You can either
run a single instance of Zeppelin, configure authentication, and let many
users log in, or let each user run their own Zeppelin instance.
I see both setups among users, and it really depends on your environment.
2. Since the 0.6.0 release, Zeppelin ships a Python interpreter. You can try
%python.
3. You can run Zeppelin on Windows by running bin/zeppelin.cmd.
4. Interpreters can share data through the resource pool. You can think of
the resource pool as a distributed map shared across all interpreters.
Although every interpreter can access the resource pool, only a few
interpreters expose an API that lets users access it directly.
SparkInterpreter, PysparkInterpreter and SparkRInterpreter are interpreters
that expose the resource pool API to users. You can access the resource pool
via the z.get() and z.put() API. See [1].
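The put/get pattern for the resource pool might look roughly like the sketch below. Inside Zeppelin the `z` object is supplied by the interpreter. The ZeppelinContextStub class and the shared dict here are hypothetical stand-ins, included only so the sketch runs outside Zeppelin, and they do not reflect how the pool is implemented.

```python
class ZeppelinContextStub:
    """Hypothetical stand-in for the `z` object that Zeppelin provides."""

    def __init__(self, pool):
        # Every stub shares the same dict, mimicking one pool seen by all interpreters.
        self._pool = pool

    def put(self, name, value):
        """Store a value in the shared pool under the given name."""
        self._pool[name] = value

    def get(self, name):
        """Look up a value that any interpreter stored earlier."""
        return self._pool.get(name)


shared_pool = {}
z_spark = ZeppelinContextStub(shared_pool)    # plays the role of `z` in a %spark paragraph
z_pyspark = ZeppelinContextStub(shared_pool)  # plays the role of `z` in a %pyspark paragraph

# One paragraph publishes a value, another paragraph reads it back by name.
z_spark.put("row_count", 42)
row_count = z_pyspark.get("row_count")
print(row_count)
```

In a real note only the z.put(...) and z.get(...) calls would appear, each in its own paragraph.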
Thanks,
moon
[1] http://zeppelin.apache.org/docs/latest/interpreter/spark.html#object-exchange
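For point 2, a paragraph using the Python interpreter might look like the following minimal sketch. The %python directive is Zeppelin paragraph syntax rather than Python, so it is shown as a comment. The summarize function and the sample values are hypothetical, chosen only to show plain Python running without Spark.

```python
# In a Zeppelin note the paragraph would begin with the line:  %python
# (that directive is Zeppelin syntax, not Python, so it appears here as a comment).
import statistics


def summarize(values):
    """Return basic statistics using only the standard library (no Spark)."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


summary = summarize([2, 4, 6, 8])
print(summary)
```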
On Sat, Sep 3, 2016 at 6:45 PM York Huang <yo...@gmail.com> wrote:
> Hi,
>
> I am new to Zeppelin and have a few questions.
> 1. Should I install Zeppelin on a Hadoop edge node and have every user access
> it from a browser? Or should every user install their own Zeppelin?
>
> 2. How do I run standard Python without using Spark?
>
> 3. Can I install Zeppelin on a Windows server?
>
> 4. Is it possible to share data between interpreters?
>
> Thanks
>
> York
>
> Sent from my iPhone