Posted to issues@kylin.apache.org by "liyang (JIRA)" <ji...@apache.org> on 2016/05/06 04:00:15 UTC

[jira] [Updated] (KYLIN-1515) Make Kylin run on MapR

     [ https://issues.apache.org/jira/browse/KYLIN-1515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang updated KYLIN-1515:
--------------------------
    Summary: Make Kylin run on MapR  (was: Cube Build - java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses)

> Make Kylin run on MapR
> ----------------------
>
>                 Key: KYLIN-1515
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1515
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v1.5.0
>         Environment: MapR - Hadoop 2.5.1
>            Reporter: Richard Calaba
>            Assignee: Shaofeng SHI
>
> Although MapR is not officially supported, we were able to use Kylin 1.2 on our MapR distribution successfully.
> After upgrading to Kylin 1.5.0 we are facing an issue with the Cube Build process - the same process that worked on 1.2 without issues. The cube is created from scratch (no Kylin metadata migration) on a clean install of Kylin 1.5.0 (the HDFS directory /kylin and the HBase tables KYLIN* and kylin* were deleted prior to the upgrade from 1.2 to 1.5.0).
> The build process fails in Step 1, complaining about the property "mapreduce.framework.name". According to this post https://stackoverflow.com/questions/19642862/cannot-initialize-cluster-exception-while-running-job-on-hadoop-2 - the solution should be to ensure that this property is correctly set in mapred-site.xml.
> Originally in our MapR distribution the property was commented out (with the value yarn-tez). Even after adding an uncommented entry with the value "yarn", the build process still fails with the same exception, so I am not sure what is wrong with our cluster configuration. Does anyone have an idea?
> Below is our mapred-site.xml content:
> ==============================
> cat /opt/mapr/hadoop/hadoop-2.5.1/etc/hadoop/mapred-site.xml
> <!-- Put site-specific property overrides in this file. -->
> <configuration>
>   <property>
>     <name>mapreduce.jobhistory.address</name>
>     <value>node1:10020</value>
>   </property>
>   <property>
>     <name>mapreduce.jobhistory.webapp.address</name>
>     <value>node1:19888</value>
>   </property>
>   <!--
>   <property>
>     <name>mapreduce.framework.name</name>
>     <value>yarn-tez</value>
>   </property>
>   -->
>   <property>
>     <name>mapreduce.framework.name</name>
>     <value>yarn</value>
>   </property>
> </configuration>
> Known workaround:
> ================
> A known workaround that makes this error disappear is to delete this property section from conf/kylin_hive_conf.xml:
> <property>
> <name>dfs.block.size</name>
> <value>32000000</value>
> <description>Want more mappers for in-mem cubing, thus smaller the DFS block size</description>
> </property>
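The workaround above can be scripted instead of editing the file by hand. The following is a minimal sketch (not part of the original report) that removes the dfs.block.size property from a Hadoop-style XML config using only Python's standard library; the remove_property helper name and the in-place write are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

def remove_property(conf_path, prop_name):
    """Remove every <property> whose <name> equals prop_name from a
    Hadoop-style <configuration> file, rewriting it in place.
    Returns the number of properties removed."""
    tree = ET.parse(conf_path)
    root = tree.getroot()
    removed = 0
    # Copy the list first so we can remove elements while iterating.
    for prop in list(root.findall("property")):
        if prop.findtext("name") == prop_name:
            root.remove(prop)
            removed += 1
    tree.write(conf_path, encoding="utf-8", xml_declaration=True)
    return removed

# e.g. remove_property("conf/kylin_hive_conf.xml", "dfs.block.size")
```

Back up the original file before running anything like this, since the rewrite is in place.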
> The full log output of Cube Build Step 1 is attached below:
> ==============================================
> OS command error exit with 1 -- hive -e "USE default;
> DROP TABLE IF EXISTS kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255;
> CREATE EXTERNAL TABLE IF NOT EXISTS kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255
> (
> DEFAULT_BATTING_PLAYER_ID string
> ,DEFAULT_BATTING_YEAR int
> ,DEFAULT_BATTING_RUNS int
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
> STORED AS SEQUENCEFILE
> LOCATION '/kylin/kylin_metadata/kylin-3eb4b652-a2a4-4659-8b6a-dc822e1341fb/kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255';
> SET dfs.replication=2;
> SET dfs.block.size=32000000;
> SET hive.exec.compress.output=true;
> SET hive.auto.convert.join.noconditionaltask=true;
> SET hive.auto.convert.join.noconditionaltask.size=300000000;
> SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
> SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
> SET hive.merge.mapfiles=true;
> SET hive.merge.mapredfiles=true;
> SET mapred.output.compression.type=BLOCK;
> SET hive.merge.size.per.task=256000000;
> SET hive.support.concurrency=false;
> SET mapreduce.job.split.metainfo.maxsize=-1;
> INSERT OVERWRITE TABLE kylin_intermediate_TestCube_clone2_19700101000000_2922789940817071255 SELECT
> BATTING.PLAYER_ID
> ,BATTING.YEAR
> ,BATTING.RUNS
> FROM DEFAULT.BATTING as BATTING 
> LEFT JOIN DEFAULT.TEMP_BATTING as TEMP_BATTING
> ON BATTING.PLAYER_ID = TEMP_BATTING.COL_VALUE
> ;
> "
> Logging initialized using configuration in jar:file:/opt/mapr/hive/hive-1.0/lib/hive-common-1.0.0-mapr-1510.jar!/hive-log4j.properties
> OK
> Time taken: 0.611 seconds
> OK
> Time taken: 0.83 seconds
> OK
> Time taken: 0.474 seconds
> Query ID = mapr_20160321201212_610078b4-5805-43eb-8fd1-87304530a84e
> Total jobs = 3
> 2016-03-21 08:12:32	Starting to launch local task to process map join;	maximum memory = 477102080
> 2016-03-21 08:12:32	Dump the side-table for tag: 1 with group count: 95196 into file: file:/tmp/mapr/b35c5ac2-3231-4ef1-9e6b-216c0a1bd9ef/hive_2016-03-21_20-12-31_085_8296009472449837835-1/-local-10003/HashTable-Stage-9/MapJoin-mapfile01--.hashtable
> 2016-03-21 08:12:32	Uploaded 1 File to: file:/tmp/mapr/b35c5ac2-3231-4ef1-9e6b-216c0a1bd9ef/hive_2016-03-21_20-12-31_085_8296009472449837835-1/-local-10003/HashTable-Stage-9/MapJoin-mapfile01--.hashtable (7961069 bytes)
> 2016-03-21 08:12:32	End of local task; Time Taken: 0.853 sec.
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
> 	at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
> 	at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
> 	at org.apache.hadoop.mapred.JobClient.init(JobClient.java:470)
> 	at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:449)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:399)
> 	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
> 	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1619)
> 	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1379)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1019)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1009)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
> 	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
> 	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:299)
> 	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:662)
> 	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
> 	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Job Submission failed with exception 'java.io.IOException(Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.)'
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)