Posted to dev@ambari.apache.org by "Alejandro Fernandez (JIRA)" <ji...@apache.org> on 2015/03/03 01:04:06 UTC

[jira] [Commented] (AMBARI-9882) Pig Service check fails in a Kerberized cluster

    [ https://issues.apache.org/jira/browse/AMBARI-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344069#comment-14344069 ] 

Alejandro Fernandez commented on AMBARI-9882:
---------------------------------------------

This occurs due to a combination of things:
1. params.py in Pig creates a partial object for HdfsDirectory where
{code}
hdfs_user=hdfs_principal_name if security_enabled else hdfs_user
{code}

2. The value above is then used by dynamic_variable_interpretation.py, which calls HdfsDirectory and ends up running kinit as the user taken from hdfs_user. The problem is that in a Kerberized cluster "hdfs@EXAMPLE.COM" is not a valid user to run the kinit command as.
Further, in the _copy_files function, the call to CopyFromLocal also needs to execute its commands as ambari-qa instead of hdfs@EXAMPLE.COM, while still granting file ownership to the latter user.
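The separation the two points above call for can be sketched as follows. This is an illustrative sketch only, not Ambari's actual API: the helper name resolve_identities and its signature are hypothetical. The idea is that the smoke-test user runs kinit and CopyFromLocal, while the HDFS user/principal only receives ownership of the copied files.
{code}
def resolve_identities(security_enabled, hdfs_user, hdfs_principal_name,
                       smoke_user):
    """Pick the identity to run kinit/CopyFromLocal as, versus the identity
    that should own the copied files.

    Before the fix, the run-as identity came from hdfs_user, which in a
    secured cluster had been reassigned to the full principal
    ("hdfs@EXAMPLE.COM") -- not a valid local user for kinit.
    """
    run_as = smoke_user  # e.g. "ambari-qa": always a real local user
    owner = hdfs_principal_name if security_enabled else hdfs_user
    return run_as, owner
{code}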

> Pig Service check fails in a Kerberized cluster
> -----------------------------------------------
>
>                 Key: AMBARI-9882
>                 URL: https://issues.apache.org/jira/browse/AMBARI-9882
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>             Fix For: 2.0.0
>
>
> Deployed a cluster with HDFS, MapReduce2, YARN, Tez, Pig, ZooKeeper, Ambari Metrics, and Kafka.
> Enabled security.
> After enabling security, the Pig service check fails with the error below.
> {code}
> 2015-02-26 20:25:33,114 - Error while executing command 'service_check':
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 208, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/PIG/0.12.0.2.0/package/scripts/service_check.py", line 98, in service_check
>     user      = params.smokeuser
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 276, in action_run
>     raise ex
> Fail: Execution of 'pig -x tez /var/lib/ambari-agent/data/tmp/pigSmoke.sh' returned 2. 15/02/26 20:25:23 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
> 15/02/26 20:25:23 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
> 15/02/26 20:25:23 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL
> 15/02/26 20:25:23 INFO pig.ExecTypeProvider: Trying ExecType : TEZ
> 15/02/26 20:25:23 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType
> 2015-02-26 20:25:23,404 [main] INFO  org.apache.pig.Main - Apache Pig version 0.14.0.2.2.2.0-2493 (rexported) compiled Feb 26 2015, 00:27:01
> 2015-02-26 20:25:23,404 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/ambari-qa/pig_1424982323402.log
> 2015-02-26 20:25:24,864 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/ambari-qa/.pigbootup not found
> 2015-02-26 20:25:25,239 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://ip-172-31-12-50.ec2.internal:8020
> 2015-02-26 20:25:28,207 [main] INFO  org.apache.hadoop.hdfs.DFSClient - Created HDFS_DELEGATION_TOKEN token 53 for ambari-qa on 172.31.12.50:8020
> 2015-02-26 20:25:28,294 [main] INFO  org.apache.hadoop.mapreduce.security.TokenCache - Got dt for hdfs://ip-172-31-12-50.ec2.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.12.50:8020, Ident: (HDFS_DELEGATION_TOKEN token 53 for ambari-qa)
> 2015-02-26 20:25:28,358 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
> 2015-02-26 20:25:28,508 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
> 2015-02-26 20:25:28,639 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
> 2015-02-26 20:25:28,904 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - Tez staging directory is /tmp/temp830042415
> 2015-02-26 20:25:28,945 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.plan.TezCompiler - File concatenation threshold: 100 optimistic? false
> 2015-02-26 20:25:29,154 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - Generating mapreduce api input splits
> 2015-02-26 20:25:29,270 [main] INFO  org.apache.hadoop.hdfs.DFSClient - Created HDFS_DELEGATION_TOKEN token 54 for ambari-qa on 172.31.12.50:8020
> 2015-02-26 20:25:29,271 [main] INFO  org.apache.hadoop.mapreduce.security.TokenCache - Got dt for hdfs://ip-172-31-12-50.ec2.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.12.50:8020, Ident: (HDFS_DELEGATION_TOKEN token 54 for ambari-qa)
> 2015-02-26 20:25:29,275 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
> 2015-02-26 20:25:29,276 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> 2015-02-26 20:25:29,347 [main] INFO  com.hadoop.compression.lzo.GPLNativeCodeLoader - Loaded native gpl library
> 2015-02-26 20:25:29,350 [main] INFO  com.hadoop.compression.lzo.LzoCodec - Successfully loaded & initialized native-lzo library [hadoop-lzo rev 66217595dd210805b8cf223b2d1b3b6f77fda073]
> 2015-02-26 20:25:29,358 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
> 2015-02-26 20:25:29,389 [main] INFO  org.apache.tez.mapreduce.hadoop.MRInputHelpers - NumSplits: 1, SerializedSize: 396
> 2015-02-26 20:25:29,987 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: guava-11.0.2.jar
> 2015-02-26 20:25:29,987 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: antlr-runtime-3.4.jar
> 2015-02-26 20:25:29,987 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: pig-0.14.0.2.2.2.0-2493-core-h2.jar
> 2015-02-26 20:25:29,988 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: joda-time-2.7.jar
> 2015-02-26 20:25:29,988 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: automaton-1.11-8.jar
> 2015-02-26 20:25:30,051 [main] INFO  org.apache.hadoop.hdfs.DFSClient - Created HDFS_DELEGATION_TOKEN token 55 for ambari-qa on 172.31.12.50:8020
> 2015-02-26 20:25:30,053 [main] INFO  org.apache.hadoop.mapreduce.security.TokenCache - Got dt for hdfs://ip-172-31-12-50.ec2.internal:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.31.12.50:8020, Ident: (HDFS_DELEGATION_TOKEN token 55 for ambari-qa)
> 2015-02-26 20:25:30,196 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.merge.percent to 0.66 from MR setting mapreduce.reduce.shuffle.merge.percent
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.fetch.buffer.percent to 0.7 from MR setting mapreduce.reduce.shuffle.input.buffer.percent
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.io.sort.mb to 200 from MR setting mapreduce.task.io.sort.mb
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.memory.limit.percent to 0.25 from MR setting mapreduce.reduce.shuffle.memory.limit.percent
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.io.sort.factor to 100 from MR setting mapreduce.task.io.sort.factor
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.connect.timeout to 180000 from MR setting mapreduce.reduce.shuffle.connect.timeout
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.internal.sorter.class to org.apache.hadoop.util.QuickSort from MR setting map.sort.class
> 2015-02-26 20:25:30,197 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.merge.progress.records to 10000 from MR setting mapreduce.task.merge.progress.records
> 2015-02-26 20:25:30,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.compress to false from MR setting mapreduce.map.output.compress
> 2015-02-26 20:25:30,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.sort.spill.percent to 0.7 from MR setting mapreduce.map.sort.spill.percent
> 2015-02-26 20:25:30,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.ssl.enable to false from MR setting mapreduce.shuffle.ssl.enabled
> 2015-02-26 20:25:30,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.ifile.readahead to true from MR setting mapreduce.ifile.readahead
> 2015-02-26 20:25:30,198 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.parallel.copies to 30 from MR setting mapreduce.reduce.shuffle.parallelcopies
> 2015-02-26 20:25:30,199 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.ifile.readahead.bytes to 4194304 from MR setting mapreduce.ifile.readahead.bytes
> 2015-02-26 20:25:30,199 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.task.input.post-merge.buffer.percent to 0.0 from MR setting mapreduce.reduce.input.buffer.percent
> 2015-02-26 20:25:30,199 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.shuffle.read.timeout to 180000 from MR setting mapreduce.reduce.shuffle.read.timeout
> 2015-02-26 20:25:30,199 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.util.MRToTezHelper - Setting tez.runtime.compress.codec to org.apache.hadoop.io.compress.DefaultCodec from MR setting mapreduce.map.output.compress.codec
> 2015-02-26 20:25:30,309 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - For vertex - scope-5: parallelism=1, memory=1024, java opts=-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.2.2.0-2493 -Xmx756m -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=<LOG_DIR> -Dtez.root.logger=INFO,CLA 
> 2015-02-26 20:25:30,510 [PigTezLauncher-0] INFO  org.apache.pig.tools.pigstats.tez.TezScriptState - Pig script settings are added to the job
> 2015-02-26 20:25:30,529 [PigTezLauncher-0] INFO  org.apache.tez.client.TezClient - Tez Client Version: [ component=tez-api, version=0.5.2.2.2.2.0-2493, revision=4caaebefdb9de0a1568ffa88cd29222b8347fad1, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTIme=20150225-2358 ]
> 2015-02-26 20:25:31,558 [PigTezLauncher-0] INFO  org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://ip-172-31-12-50.ec2.internal:8188/ws/v1/timeline/
> 2015-02-26 20:25:31,799 [PigTezLauncher-0] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at ip-172-31-12-50.ec2.internal/172.31.12.50:8050
> 2015-02-26 20:25:32,029 [PigTezLauncher-0] INFO  org.apache.tez.client.TezClient - Session mode. Starting session.
> 2015-02-26 20:25:32,030 [PigTezLauncher-0] INFO  org.apache.tez.client.TezClientUtils - Using tez.lib.uris value from configuration: /hdp/apps/2.2.2.0-2493/tez/tez.tar.gz
> 2015-02-26 20:25:32,033 [PigTezLauncher-0] ERROR org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Cannot submit DAG
> java.io.FileNotFoundException: File does not exist: /hdp/apps/2.2.2.0-2493/tez/tez.tar.gz
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1140)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$19.doCall(DistributedFileSystem.java:1132)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1132)
> 	at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:750)
> 	at org.apache.tez.client.TezClientUtils.getLRFileStatus(TezClientUtils.java:127)
> 	at org.apache.tez.client.TezClientUtils.setupTezJarsLocalResources(TezClientUtils.java:178)
> 	at org.apache.tez.client.TezClient.getTezJarResources(TezClient.java:721)
> 	at org.apache.tez.client.TezClient.start(TezClient.java:298)
> 	at org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.createSession(TezSessionManager.java:95)
> 	at org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.getClient(TezSessionManager.java:195)
> 	at org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:159)
> 	at org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:167)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> 2015-02-26 20:25:32,468 [main] INFO  org.apache.pig.tools.pigstats.tez.TezPigScriptStats - Script Statistics:
>        HadoopVersion: 2.6.0.2.2.2.0-2493                                                                                  
>           PigVersion: 0.14.0.2.2.2.0-2493                                                                                 
>           TezVersion: 0.5.2.2.2.2.0-2493                                                                                  
>               UserId: ambari-qa                                                                                           
>             FileName: /var/lib/ambari-agent/data/tmp/pigSmoke.sh                                                          
>            StartedAt: 2015-02-26 20:25:28                                                                                 
>           FinishedAt: 2015-02-26 20:25:32                                                                                 
>             Features: UNKNOWN                                                                                             
> Failed!
> DAG PigLatin:pigSmoke.sh-0_scope-0:
>        ApplicationId: null                                                                                                
>   TotalLaunchedTasks: -1                                                                                                  
>        FileBytesRead: -1                                                                                                  
>     FileBytesWritten: -1                                                                                                  
>        HdfsBytesRead: 0                                                                                                   
>     HdfsBytesWritten: 0                                                                                                   
> Input(s):
> Output(s):
> 2015-02-26 20:25:32,471 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message
> Details at logfile: /home/ambari-qa/pig_1424982323402.log
> 2015-02-26 20:25:32,507 [main] INFO  org.apache.pig.Main - Pig script completed in 9 seconds and 361 milliseconds (9361 ms)
> 2015-02-26 20:25:32,507 [main] INFO  org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - Shutting down thread pool
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)