Posted to users@zeppelin.apache.org by Abhi Basu <90...@gmail.com> on 2016/07/29 16:22:37 UTC

Configure Zeppelin to connect to remote Hadoop

In the past I have used Zeppelin on an edge node of a CDH cluster. I am now
trying to figure out how to connect Zeppelin running on a CentOS node to a
remote Hadoop cluster so that I can use Spark and Hive/Impala.

Thanks,

Abhi

-- 
Abhi Basu

Re: Configure Zeppelin to connect to remote Hadoop

Posted by moon soo Lee <mo...@apache.org>.
Hi,

The error around com.fasterxml.jackson.databind occurs when there are two
different versions of the Jackson library in your classpath.
Please check [1] and make sure your classpath has only one version of the
Jackson library.
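
One way to look for such a conflict is to scan the classpath for Jackson jars
that show up with more than one version. The snippet below is only a sketch: it
assumes Maven-style jar names (for example jackson-databind-2.4.4.jar), and the
JacksonConflicts object and its method names are made up for illustration. It
only sees jars listed in java.class.path, so jars loaded through other class
loaders may not appear.

```scala
// Sketch: report Jackson artifacts that occur with more than one version.
// JacksonConflicts is a hypothetical helper, not part of Zeppelin or Spark.
object JacksonConflicts {
  // Assumes names of the form <artifact>-<version>.jar, e.g. jackson-core-2.4.4.jar
  private val JarName = """(jackson-[a-z0-9_.\-]+?)-(\d[\w.\-]*)\.jar""".r

  /** Maps each Jackson artifact to the distinct versions found in `entries`. */
  def versionsByArtifact(entries: Seq[String]): Map[String, Set[String]] =
    entries
      .map(path => new java.io.File(path).getName) // keep the file name only
      .collect { case JarName(artifact, version) => artifact -> version }
      .groupBy(_._1)
      .map { case (artifact, pairs) => artifact -> pairs.map(_._2).toSet }

  /** Keeps only the artifacts that occur with two or more versions. */
  def conflicts(entries: Seq[String]): Map[String, Set[String]] =
    versionsByArtifact(entries).filter { case (_, versions) => versions.size > 1 }
}

// Inspect the classpath of the current JVM (for example from a %spark paragraph).
val entries = System.getProperty("java.class.path")
  .split(java.io.File.pathSeparator)
  .toSeq

JacksonConflicts.conflicts(entries).foreach { case (artifact, versions) =>
  println(s"$artifact: ${versions.toSeq.sorted.mkString(", ")}")
}
```

If nothing is printed, no duplicate Jackson versions were found among the jars
listed in java.class.path; that does not rule out a conflict coming from jars
added at runtime.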

Hope this helps.

Best,
moon

[1]
http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/com-fasterxml-jackson-databind-JsonMappingException-td1607.html


On Sat, Jul 30, 2016 at 2:01 AM Cobo Rodriguez Roberto <ro...@produban.com>
wrote:

> Hi there,
>
> This is tremendously interesting for me as well.
>
> I started with Spark in local mode, assuming it would be a complete
> disaster in terms of performance.
>
> I have been able to reach HDFS files hosted in a remote data lake from the
> spark-shell.
>
> My next step is the one you are describing. The idea is that workers must
> be as close as possible to the data nodes, because the best network is the
> one that doesn’t exist (let them run on the very same node to reduce
> communication costs).
>
> After that, it would be nice if Zeppelin spared me the pain of this error,
> which I am still dealing with:
>
> %spark
> val data = Array(1, 2, 3, 4, 5)
> val path2file="/home/zeppelin/Downloads/bank.csv"
> val cotizFile = sc.textFile(path2file,2)
>
> data: Array[Int] = Array(1, 2, 3, 4, 5)
> path2file: String = /home/zeppelin/Downloads/bank.csv
> com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
> at [Source: {"id":"24","name":"textFile"}; line: 1, column: 1]
> at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
> at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)
>
> So, please carry on down this path. Perhaps by configuring my remote
> connection correctly I will be able to get past my “stopper”.
>
> Thanks for your input. I look forward to your next messages.
>

RE: Configure Zeppelin to connect to remote Hadoop

Posted by Cobo Rodriguez Roberto <ro...@produban.com>.
Hi there,

This is tremendously interesting for me as well.
I started with Spark in local mode, assuming it would be a complete disaster in terms of performance.
I have been able to reach HDFS files hosted in a remote data lake from the spark-shell.

My next step is the one you are describing. The idea is that workers must be as close as possible to the data nodes, because the best network is the one that doesn’t exist (let them run on the very same node to reduce communication costs).

After that, it would be nice if Zeppelin spared me the pain of this error, which I am still dealing with:

%spark
val data = Array(1, 2, 3, 4, 5)
val path2file="/home/zeppelin/Downloads/bank.csv"
val cotizFile = sc.textFile(path2file,2)

data: Array[Int] = Array(1, 2, 3, 4, 5)
path2file: String = /home/zeppelin/Downloads/bank.csv
com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'id' (in class org.apache.spark.rdd.RDDOperationScope)
at [Source: {"id":"24","name":"textFile"}; line: 1, column: 1]
at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:148)
at com.fasterxml.jackson.databind.DeserializationContext.mappingException(DeserializationContext.java:843)

So, please carry on down this path. Perhaps by configuring my remote connection correctly I will be able to get past my “stopper”.

Thanks for your input. I look forward to your next messages.


Saludos cordiales / Kind regards

------------------------------------------------------
Roberto Cobo Rodríguez
Mail: rocobo@produban.com


From: Abhi Basu [mailto:9000revs@gmail.com]
Sent: Friday, July 29, 2016 18:47
To: users@zeppelin.apache.org
Subject: Re: Configure Zeppelin to connect to remote Hadoop

So, do I have to use the IP address of the remote Hadoop node?

How about the Hive configs?

On Fri, Jul 29, 2016 at 9:44 AM, Joaquin Alzola <Jo...@lebara.com> wrote:
>I am trying to figure out how to connect Zeppelin running on a CentOS node to a remote hadoop cluster to be able to use Spark, Hive/Impala.
You mean this?
master spark://master:7077

Then spark will connect to hdfs



--
Abhi Basu


Re: Configure Zeppelin to connect to remote Hadoop

Posted by Abhi Basu <90...@gmail.com>.
So, do I have to use the IP address of the remote Hadoop node?

How about the Hive configs?

On Fri, Jul 29, 2016 at 9:44 AM, Joaquin Alzola <Jo...@lebara.com>
wrote:

> > I am trying to figure out how to connect Zeppelin running on a CentOS
> > node to a remote hadoop cluster to be able to use Spark, Hive/Impala.
>
> You mean this?
>
> master spark://master:7077
>
>
>
> Then spark will connect to hdfs
>



-- 
Abhi Basu

RE: Configure Zeppelin to connect to remote Hadoop

Posted by Joaquin Alzola <Jo...@lebara.com>.
>I am trying to figure out how to connect Zeppelin running on a CentOS node to a remote hadoop cluster to be able to use Spark, Hive/Impala.
Do you mean this?

master spark://master:7077

Then Spark will connect to HDFS.
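
Once the master is set, reading a file from the remote cluster could look
roughly like the sketch below. The NameNode host, port and file path are
placeholders and not values from this thread, and hdfsUri is a hypothetical
helper. A fully qualified URI is used here on the assumption that the
interpreter does not already pick up the cluster's default filesystem from its
Hadoop configuration.

```scala
import java.net.URI

// Builds a fully qualified HDFS URI.
// Host, port and path below are placeholders, not values from this thread.
def hdfsUri(host: String, port: Int, path: String): String = {
  require(path.startsWith("/"), "path must be absolute")
  new URI("hdfs", null, host, port, path, null, null).toString
}

val input = hdfsUri("namenode.example.com", 8020, "/user/zeppelin/bank.csv")
println(input)

// In a Zeppelin %spark paragraph, where `sc` is provided by the interpreter:
// val lines = sc.textFile(input, 2)
// println(lines.count())
```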