Posted to user@spark.apache.org by "leosandylh@gmail.com" <le...@gmail.com> on 2014/01/13 17:29:48 UTC

Can you help me?

Hi,
I run an HQL script in Hive with these parameters set:
set hive.exec.parallel=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapred.max.split.size=100000000;
set mapred.min.split.size.per.node=100000000;
set mapred.min.split.size.per.rack=100000000;
set tcl.name=cr_24hourdM.tcl;
set mapred.queue.name=tcl2;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set mapred.min.split.size=536870912;
set hive.exec.reducers.max=239;
set hive.exec.reducers.bytes.per.reducer=80000000;
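For reference, the same settings can also be passed on the command line with --hiveconf instead of in-script "set" statements. A rough sketch (the script file name is illustrative, and I've only shown a few of the parameters):

```shell
# Sketch: supplying Hive/MapReduce settings via --hiveconf at launch time
# instead of "set" statements inside the script.
# cr_24hourdM.hql is an illustrative file name.
hive --hiveconf hive.exec.parallel=true \
     --hiveconf hive.exec.dynamic.partition.mode=nonstrict \
     --hiveconf mapred.queue.name=tcl2 \
     --hiveconf hive.exec.reducers.max=239 \
     --hiveconf hive.exec.reducers.bytes.per.reducer=80000000 \
     -f cr_24hourdM.hql
```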

Could I also set the same parameters in the CLI when I run the HQL in Shark?
Or should I set shark.exec.mode=hive?
What is the difference between the two modes?

If I just run an HQL query without caching any table, could I set spark.storage.memoryFraction=0.1 or smaller?
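To illustrate why I am asking: with spark.storage.memoryFraction at its default of 0.6, a large share of each executor's heap is reserved for RDD caching, which seems wasted when nothing is cached. A rough sketch of the arithmetic (the heap size is illustrative):

```python
def storage_memory_mb(executor_heap_mb, memory_fraction):
    """Approximate heap reserved for RDD caching under the
    spark.storage.memoryFraction model (pre-Spark-1.6)."""
    return executor_heap_mb * memory_fraction

heap = 4096  # MB, illustrative executor heap size

default_reserved = storage_memory_mb(heap, 0.6)  # default fraction
reduced_reserved = storage_memory_mb(heap, 0.1)  # proposed smaller fraction

# With the smaller fraction, far less heap is set aside for a cache
# that would never be used, leaving more room for shuffle and tasks.
print(int(default_reserved), int(reduced_reserved))
```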

Thanks!

leosandylh@gmail.com