Posted to commits@tajo.apache.org by Apache Wiki <wi...@apache.org> on 2013/09/09 07:56:35 UTC

[Tajo Wiki] Update of "GettingStarted" by HyunsikChoi

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tajo Wiki" for change notification.

The "GettingStarted" page has been changed by HyunsikChoi:
https://wiki.apache.org/tajo/GettingStarted?action=diff&rev1=10&rev2=11

  export TAJO_HOME=<tajo-install-dir>
  }}}
  
+ Tajo provides two cluster running modes: on-demand mode, which uses YARN, and standby mode, in which Tajo works with its own resource manager. Choose one of the two.
+ 
+ === On-demand Mode ===
+ On-demand mode employs Hadoop YARN as the primary cluster resource manager. In this mode, TajoMaster and QueryMaster ask the YARN resource manager to allocate container resources for each query, so some settings must be added to yarn-site.xml.
+ 
- Tajo requires an auxiliary service called PullServer for data repartitioning. For this, you must add or modify the following configuration parameters in $HADOOP_HOME/etc/hadoop/yarn-site.xml.
+ First of all, in on-demand mode, Tajo requires an auxiliary service called PullServer for data repartitioning. For this, you must add or modify the following configuration parameters in $HADOOP_HOME/etc/hadoop/yarn-site.xml.
  
  {{{
  <property>
@@ -78, +83 @@

    </property>
  
    <property>
-     <name>tajo.cluster.distributed</name>
-     <value>true</value>
+     <name>tajo.task.localdir</name>
+     <value>/tmp/tajo-localdir</value>
    </property>
  }}}
  
  For more detail on configuration, read the Configuration Guide.
+ 
+ === Standby Mode ===
+ In standby mode, TajoMaster preempts the cluster resources and uses its own cluster resource manager, TajoWorkerResourceManager, which coordinates cluster resources (CPU, memory, and disk) and allocates them to each query.
+ 
+ {{{
+   <property>
+     <name>tajo.rootdir</name>
+     <value>hdfs://hostname:port/tajo</value>
+   </property>
+ 
+   <property>
+     <name>tajo.task.localdir</name>
+     <value>/tmp/tajo-localdir</value>
+   </property>
+ 
+   <property>
+     <name>tajo.resource.manager</name>
+     <value>org.apache.tajo.master.rm.TajoWorkerResourceManager</value>
+   </property>
+ 
+   <property>
+     <name>tajo.worker.slots.use.os.info</name>
+     <value>false</value>
+     <description>If true, Tajo obtains physical resource information from the OS. If false, it takes that information from the settings below.
+     </description>
+   </property>
+ 
+   <property>
+     <name>tajo.worker.slots.memoryMB</name>
+     <value>5000</value>
+   </property>
+ 
+   <property>
+     <name>tajo.worker.slots.disk</name>
+     <value>4</value>
+     <description>The number of disks on a worker</description>
+   </property>
+ 
+   <property>
+     <name>tajo.worker.slots.disk.concurrency</name>
+     <value>4</value>
+     <description>The maximum concurrency per disk slot</description>
+   </property>
+ 
+   <property>
+     <name>tajo.worker.slots.cpu.core</name>
+     <value>4</value>
+     <description>The number of CPU cores on a worker</description>
+   </property> 
+ }}}
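The description of `tajo.worker.slots.use.os.info` above suggests an alternative to listing the slot values by hand. The fragment below is a minimal sketch of that variant, not part of the wiki change itself. It assumes that, with the property set to true, the explicit `tajo.worker.slots.*` values can be left out; the page does not state this.

```xml
<!-- Sketch only: let Tajo read physical resource information from the OS. -->
<!-- Assumption: the explicit tajo.worker.slots.* settings are then not needed. -->
<property>
  <name>tajo.worker.slots.use.os.info</name>
  <value>true</value>
</property>
```

Keeping the value at false with explicit slot settings, as in the change above, is the more predictable choice when workers should use only part of a machine's resources.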
  
  == Running Tajo ==
  Before launching Tajo, create the Tajo root directory and set its permissions as follows: