Posted to dev@zeppelin.apache.org by Pradeep Reddy <pr...@gmail.com> on 2016/09/07 14:14:20 UTC

Inconsistent Spark local mode on a secured Cloudera cluster

I had a thread about this in the user group, and I was able to work
around it on the 0.6.1 build by setting SPARK_HOME. Since there is no
workaround on the latest snapshot build, I'm sending this to the dev group.
You may need a secured (Kerberos) cluster behind a proxy to reproduce these
inconsistencies/issues with the Spark interpreter.

*0.7 latest snapshot:* When I git clone and build the 0.7 snapshot and run
the Spark interpreter in local mode, after copying hive-site.xml to the conf
directory with just the HADOOP_CONF_DIR property set in zeppelin-env.sh,
I'm able to run a Spark paragraph to test the interpreter, but it doesn't
talk to my cluster: it runs in standalone mode and shows only the "default"
database when executing *z.show(sql("show databases"))*. *When I then set
the SPARK_HOME variable in zeppelin-env.sh and restart, I get the stack
trace cascaded below.*

*0.6.1 build:* When I download the 0.6.1 source and run the Spark interpreter
in local mode, after copying hive-site.xml to the conf directory with just
the HADOOP_CONF_DIR property set in zeppelin-env.sh, I'm able to run a
Spark paragraph to test the interpreter, but it doesn't talk to my cluster:
it runs in standalone mode and shows only the "default" database when
executing *z.show(sql("show databases"))*. *When I then set the SPARK_HOME
variable in zeppelin-env.sh and restart, the Spark interpreter is able to
talk to my cluster and shows all the databases in the cluster.*

*0.5.6 build:* When I download the 0.5.6 source and run the Spark interpreter
in local mode, after copying hive-site.xml to the conf directory with just
the HADOOP_CONF_DIR property set in zeppelin-env.sh, I'm able to run a
Spark paragraph to test the interpreter, *and it talks to my cluster
without my even setting the SPARK_HOME environment variable; I can see all
my databases when executing* *z.show(sql("show databases"))*.
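For reference, here is roughly what my conf/zeppelin-env.sh looks like in the
scenarios above (the paths are illustrative placeholders for a typical CDH
layout, not the exact ones from my environment):

```shell
# conf/zeppelin-env.sh -- settings discussed above; paths are illustrative

# Point Zeppelin at the cluster's Hadoop client configuration
export HADOOP_CONF_DIR=/etc/hadoop/conf

# On 0.6.1, additionally setting SPARK_HOME makes local mode pick up the
# cluster's Spark distribution (and its Hadoop client jars) instead of the
# Spark embedded in the Zeppelin build; on the 0.7 snapshot this is the
# setting that triggers the stack trace below.
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
```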

started by scheduler org.apache.zeppelin.spark.SparkInterpreter335845091
ERROR [2016-08-30 17:45:37,237] ({pool-2-thread-2} Job.java[run]:189) - Job failed
java.lang.IllegalArgumentException: Invalid rule: L
RULE:[2:$1@$0](.*@\Q<DOMAIN1>.COM\E$)s/@\Q<DOMAIN1>\E$//L
RULE:[1:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
RULE:[2:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
DEFAULT
        at org.apache.hadoop.security.authentication.util.KerberosName.parseRules(KerberosName.java:321)
        at org.apache.hadoop.security.authentication.util.KerberosName.setRules(KerberosName.java:386)
        at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:75)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:227)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
        at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:275)
        at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:269)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:820)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:539)
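My guess at what is happening: the auth_to_local rules above end in the "/L"
lowercase flag, which newer Hadoop clients understand (added in HADOOP-12751)
but older ones do not, so an older Hadoop client bundled with the interpreter
would stop parsing at the stray "L". Here is a minimal Python sketch of that
parser behavior; the regex is my simplified, hypothetical rendering of
Hadoop's rule grammar, not the actual implementation:

```python
import re

# Simplified pre-HADOOP-12751 rule grammar: DEFAULT, or
# RULE:[n:pattern](match)s/from/to/(g), with NO trailing "/L" flag.
RULE_PATTERN = re.compile(
    r"\s*((DEFAULT)|(RULE:\[(\d*):([^\]]*)\](\(([^)]*)\))?"
    r"(s/([^/]*)/([^/]*)/(g)?)?))"
)

def parse_rules(rules: str):
    """Mimic KerberosName.parseRules: repeatedly match one rule at the
    front of the string; any leftover text is reported as an invalid rule."""
    remaining = rules.strip()
    parsed = []
    while remaining:
        m = RULE_PATTERN.match(remaining)
        if not m:
            raise ValueError("Invalid rule: " + remaining)
        parsed.append(m.group(1))
        remaining = remaining[m.end():].lstrip()
    return parsed

# Without the /L flag the rule parses cleanly...
parse_rules(r"RULE:[1:$1@$0](.*@\QEXAMPLE.COM\E$)s/@\QEXAMPLE.COM\E$//"
            + "\nDEFAULT")

# ...but with the newer /L lowercase flag, an old-style parser leaves a
# stray "L" it cannot consume, matching the error in the stack trace:
try:
    parse_rules(r"RULE:[1:$1@$0](.*@\QEXAMPLE.COM\E$)s/@\QEXAMPLE.COM\E$//L"
                + "\nDEFAULT")
except ValueError as e:
    print(str(e).splitlines()[0])  # prints: Invalid rule: L
```

If that guess is right, it would explain why pointing SPARK_HOME at the
cluster's Spark (with its matching Hadoop client jars) behaves differently
from the embedded Spark across these builds.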