Posted to dev@tajo.apache.org by Azuryy Yu <az...@gmail.com> on 2015/01/16 06:37:26 UTC

Some Tajo-0.9.0 questions

Hi,

I tested Tajo about half a year ago, then stopped focusing on it because of
some other work.

This week I set up a small dev Tajo cluster (six nodes, VMs) based on
Hadoop-2.6.0.

So my questions are:

1) As far as I knew half a year ago, Tajo ran on Yarn, using the Yarn
scheduler to manage job resources. But now I find it doesn't rely on Yarn,
because I only started the HDFS daemons, no Yarn daemons. So Tajo has its
own job scheduler?

2) Do we need to place file replicas on every node of the Tajo cluster?
For example, I have a six-node Tajo cluster, so should I set the HDFS block
replication factor to six? I ask because:

I noticed that when I run a Tajo query, some nodes are busy but others are
idle, because the file's blocks are located only on those nodes and not on
the others.

3) The test data set is 4 million rows, nearly several GB, but it's very
slow when I run: select count(distinct ID) from ****;
Any possible problems here?


Thanks

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Hi Hyunsik,

I appreciate your detailed explanation. It's very helpful. Thanks.


On Thu, Jan 22, 2015 at 10:47 PM, Hyunsik Choi <hy...@apache.org> wrote:

> Hi Azuryy,
>
> Let me share some more of the best practices I know. Since the resource
> model of Tajo is evolving rapidly, the current model is not complete and,
> in my opinion, not intuitive. So it may be hard for you to set the best
> resource configs for Tajo.
>
> Even though the resource model includes the number of CPU cores,
> memory size, and disk resources, its essential purpose is to determine
> the number of concurrent tasks on each machine.
>
> The proper number of concurrent tasks can be derived from the hardware
> resource capacity. The following rules come from my experience:
>
>  * (Disk) Performance has been best when each SATA disk handles 2
> concurrent tasks at the same time.
>  * (CPU) Performance has been best in most cases when the number of
> concurrent tasks = the number of cores.
>  * (Memory) Each task consumes 2 ~ 4 GB. Memory consumption per task
> varies with the workload. More than 2 GB has shown stable and
> efficient performance even in heavy and long-running workloads.
>
> Note that the smallest of the concurrency numbers derived from the three
> kinds of hardware capacity determines the best number of concurrent tasks
> for the machine. For example, consider a machine equipped with 2 CPU
> cores and 12 disks. In this case, the 2 CPU cores determine the best
> number of concurrent tasks because 2 concurrent tasks from the CPU cores
> is fewer than 24 concurrent tasks from the 12 disks.
>
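> As a rough sketch of that arithmetic (the function name and the reserved
> memory figure below are only illustrative assumptions; the per-task
> figures are the ones above: 2 tasks per SATA disk, 1 task per core,
> about 3 GB per task):
>
> def derive_concurrency(cores, sata_disks, mem_gb, reserved_gb=4, task_mem_gb=3):
>     cpu_limit  = cores                                  # ~1 task per core
>     disk_limit = sata_disks * 2                         # ~2 tasks per SATA disk
>     mem_limit  = (mem_gb - reserved_gb) // task_mem_gb  # ~3 GB per task
>     return min(cpu_limit, disk_limit, mem_limit)
>
> # e.g. 24 cores, 12 SATA disks, 64 GB  ->  min(24, 24, 20) = 20 tasks
> print(derive_concurrency(24, 12, 64))
>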
> Let's get back to the discussion of your machine. If we assume that
> the entire resources of your physical node (24 CPUs, 64 GB mem, 4 TB * 12
> HDD) are used for Tajo, I would reason as follows:
>
>  * 24 concurrent tasks is best for 24 cores.
>  * 24 tasks would be best for 12 disks if the disks are SATA. If
> they are SAS disks, each disk can accept more than 2 tasks, so 36 may be
> the best for 12 disks.
>  * Memory seems to be the scarcest of all the resources. The OS and
> other daemons will use a few GBs, so I assume only 60 GB may be
> available for Tajo. I think that 3 GB seems to be best for each
> task. In terms of memory size, 20 concurrent tasks (60 GB / 3 GB) on
> the machine seems to be the proper number.
>
> The lowest concurrency number is 20, limited by the 64 GB of memory. So
> the following configs may be proper for your physical node.
>
> tajo-site.xml
> <!--  worker  -->
> <property>
>   <name>tajo.worker.resource.memory-mb</name>
>   <value>60512</value> <!--  20 tasks + 1 qm (i.e., 3000 * 20 + 512 * 1)  -->
> </property>
> <property>
>   <name>tajo.worker.resource.disks</name>
>   <value>20</value> <!--  20 tasks (20 * 1.0f)  -->
> </property>
>
> <property>
>   <name>tajo.task.memory-slot-mb.default</name>
>   <value>3000</value> <!--  default 512 -->
> </property>
> <property>
>   <name>tajo.task.disk-slot.default</name>
>   <value>1.0f</value> <!--  default 0.5 -->
> </property>
>
>
> tajo-env.sh
> TAJO_WORKER_HEAPSIZE=60000
>
> The above config is for a single physical node dedicated to a Tajo
> worker. The configs for virtual machines may be different because each
> of them also runs an OS and other daemons.
>
> I hope that my suggestion would be helpful to you.
>
> Best regards,
> Hyunsik
>
> On Mon, Jan 19, 2015 at 5:07 PM, Azuryy Yu <az...@gmail.com> wrote:
> > Yes Hyunsik,
> >
> > but that's all I know from the Tajo website. I guess there are more
> > default configurations that are not shown on the Tajo wiki, right?
> >
> >
> >
> >
> > On Mon, Jan 19, 2015 at 4:52 PM, Hyunsik Choi <hy...@apache.org>
> wrote:
> >
> >> Thank you for sharing the machine information.
> >>
> >> In my opinion, we can boost Tajo performance on the machine very much
> >> with a proper configuration if the server is dedicated to Tajo.
> >> I think that the configuration we mentioned above only uses some
> >> of the physical resources in the machine :)
> >>
> >> Warm regards,
> >> Hyunsik
> >>
> >> On Sun, Jan 18, 2015 at 8:10 PM, Azuryy Yu <az...@gmail.com> wrote:
> >> > Thanks Hyunsik.
> >> >
> >> > I asked our infra team: my 6-node Tajo cluster was virtualized from one
> >> > host. That means I run the 6-node Tajo cluster on one physical host
> >> > (24 CPUs, 64 GB mem, 4 TB * 12 HDD).
> >> >
> >> > So I think this was the real performance bottleneck.
> >> >
> >> >
> >> >
> >> > On Mon, Jan 19, 2015 at 11:12 AM, Hyunsik Choi <hy...@apache.org>
> >> wrote:
> >> >
> >> >> Hi Azuryy,
> >> >>
> >> >> Tajo automatically rewrites distinct aggregation queries into
> >> >> multi-level aggregations, so the query rewrite that Jinho suggested
> >> >> may already be applied.
> >> >>
> >> >> I think your query response times (12 ~ 15 secs) for the distinct
> >> >> count are reasonable, because the plain count aggregation alone takes
> >> >> 5 secs. Distinct aggregation queries are usually much slower than
> >> >> plain aggregation queries because they involve sorting, large
> >> >> intermediate data, and handling of only the distinct values.
> >> >>
> >> >> In addition, I have a question so that I can give a better
> >> >> configuration guide: could you share the CPU, memory, and disks
> >> >> available for Tajo?
> >> >>
> >> >> Even though Jinho suggested a configuration, there is still room to
> >> >> set a more exact and better one. Since the resource configuration
> >> >> determines the number of concurrent tasks, it may be the main cause
> >> >> of your performance problem.
> >> >>
> >> >> Best regards,
> >> >> Hyunsik
> >> >>
> >> >> On Sun, Jan 18, 2015 at 6:54 PM, Jinho Kim <jh...@apache.org> wrote:
> >> >> > Sorry, there was a mistake in my example query.
> >> >> > Can you change it to “select count(a.auid) from ( select auid from
> >> >> > test_pl_00_0 group by auid ) a;” ?
> >> >> >
> >> >> > -Jinho
> >> >> > Best regards
> >> >> >
> >> >> > 2015-01-19 11:44 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >> >> >
> >> >> >> Sorry for not responding during the weekend.
> >> >> >> I changed hdfs-site.xml and restarted HDFS and Tajo, but it's
> >> >> >> slower than before.
> >> >> >>
> >> >> >> default> select count(a.auid) from ( select auid from test_pl_00_0 ) a;
> >> >> >> Progress: 0%, response time: 1.132 sec
> >> >> >> Progress: 0%, response time: 1.134 sec
> >> >> >> Progress: 0%, response time: 1.536 sec
> >> >> >> Progress: 0%, response time: 2.338 sec
> >> >> >> Progress: 0%, response time: 3.341 sec
> >> >> >> Progress: 3%, response time: 4.343 sec
> >> >> >> Progress: 4%, response time: 5.346 sec
> >> >> >> Progress: 9%, response time: 6.35 sec
> >> >> >> Progress: 11%, response time: 7.352 sec
> >> >> >> Progress: 16%, response time: 8.354 sec
> >> >> >> Progress: 18%, response time: 9.362 sec
> >> >> >> Progress: 24%, response time: 10.364 sec
> >> >> >> Progress: 27%, response time: 11.366 sec
> >> >> >> Progress: 29%, response time: 12.368 sec
> >> >> >> Progress: 32%, response time: 13.37 sec
> >> >> >> Progress: 37%, response time: 14.373 sec
> >> >> >> Progress: 40%, response time: 15.377 sec
> >> >> >> Progress: 42%, response time: 16.379 sec
> >> >> >> Progress: 42%, response time: 17.382 sec
> >> >> >> Progress: 43%, response time: 18.384 sec
> >> >> >> Progress: 43%, response time: 19.386 sec
> >> >> >> Progress: 45%, response time: 20.388 sec
> >> >> >> Progress: 45%, response time: 21.391 sec
> >> >> >> Progress: 46%, response time: 22.393 sec
> >> >> >> Progress: 46%, response time: 23.395 sec
> >> >> >> Progress: 48%, response time: 24.398 sec
> >> >> >> Progress: 48%, response time: 25.401 sec
> >> >> >> Progress: 50%, response time: 26.403 sec
> >> >> >> Progress: 100%, response time: 26.95 sec
> >> >> >> ?count
> >> >> >> -------------------------------
> >> >> >> 4487999
> >> >> >> (1 rows, 26.95 sec, 8 B selected)
> >> >> >> default> select count(distinct auid) from test_pl_00_0;
> >> >> >> Progress: 0%, response time: 0.88 sec
> >> >> >> Progress: 0%, response time: 0.881 sec
> >> >> >> Progress: 0%, response time: 1.283 sec
> >> >> >> Progress: 0%, response time: 2.086 sec
> >> >> >> Progress: 0%, response time: 3.088 sec
> >> >> >> Progress: 0%, response time: 4.09 sec
> >> >> >> Progress: 25%, response time: 5.092 sec
> >> >> >> Progress: 33%, response time: 6.094 sec
> >> >> >> Progress: 50%, response time: 7.096 sec
> >> >> >> Progress: 50%, response time: 8.098 sec
> >> >> >> Progress: 50%, response time: 9.099 sec
> >> >> >> Progress: 66%, response time: 10.101 sec
> >> >> >> Progress: 66%, response time: 11.103 sec
> >> >> >> Progress: 83%, response time: 12.105 sec
> >> >> >> Progress: 100%, response time: 12.268 sec
> >> >> >> ?count
> >> >> >> -------------------------------
> >> >> >> 1222356
> >> >> >> (1 rows, 12.268 sec, 8 B selected)
> >> >> >>
> >> >> >> On Sat, Jan 17, 2015 at 11:00 PM, Jinho Kim <jh...@apache.org>
> >> wrote:
> >> >> >>
> >> >> >> > Thank you for sharing.
> >> >> >> >
> >> >> >> > Can you enable dfs.datanode.hdfs-blocks-metadata.enabled in
> >> >> >> > hdfs-site.xml ?
> >> >> >> > If you enable the block metadata, the Tajo cluster can use volume
> >> >> >> > load balancing. You should restart the datanodes and the Tajo
> >> >> >> > cluster. I will investigate the performance of the count-distinct
> >> >> >> > operator. You can also change the query to
> >> >> >> > “select count(a.auid) from ( select auid from test_pl_00_0 ) a”
> >> >> >> >
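> >> >> >> > A minimal hdfs-site.xml sketch of that setting (the value shown
> >> >> >> > here is only the assumed way to "enable" it):
> >> >> >> >
> >> >> >> > <property>
> >> >> >> >   <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
> >> >> >> >   <value>true</value>
> >> >> >> > </property>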
> >> >> >> >
> >> >> >> > -Jinho
> >> >> >> > Best regards
> >> >> >> >
> >> >> >> > 2015-01-16 18:05 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >> >> >> >
> >> >> >> > > default> select count(*) from test_pl_00_0;
> >> >> >> > > Progress: 0%, response time: 0.718 sec
> >> >> >> > > Progress: 0%, response time: 0.72 sec
> >> >> >> > > Progress: 0%, response time: 1.121 sec
> >> >> >> > > Progress: 12%, response time: 1.923 sec
> >> >> >> > > Progress: 28%, response time: 2.925 sec
> >> >> >> > > Progress: 41%, response time: 3.927 sec
> >> >> >> > > Progress: 50%, response time: 4.931 sec
> >> >> >> > > Progress: 100%, response time: 5.323 sec
> >> >> >> > > 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800:
> >> >> [ParNew:
> >> >> >> > > 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K),
> >> >> >> 0.0080700
> >> >> >> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> >> > > 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800:
> >> >> [ParNew:
> >> >> >> > > 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K),
> >> >> >> 0.0068130
> >> >> >> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> >> > > 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800:
> >> >> [ParNew:
> >> >> >> > > 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K),
> >> >> >> 0.0092430
> >> >> >> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> >> > > 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800:
> >> >> [ParNew:
> >> >> >> > > 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K),
> >> >> >> 0.0068160
> >> >> >> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> >> > > ?count
> >> >> >> > > -------------------------------
> >> >> >> > > 4487999
> >> >> >> > > (1 rows, 5.323 sec, 8 B selected)
> >> >> >> > >
> >> >> >> > > On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <
> azuryyyu@gmail.com>
> >> >> wrote:
> >> >> >> > >
> >> >> >> > > > Hi,
> >> >> >> > > > There is no big improvement; sometimes it is even slower than
> >> >> >> > > > before. I also tried increasing the worker's heap size and
> >> >> >> > > > parallelism, but nothing improved.
> >> >> >> > > >
> >> >> >> > > > default> select count(distinct auid) from test_pl_00_0;
> >> >> >> > > > Progress: 0%, response time: 0.963 sec
> >> >> >> > > > Progress: 0%, response time: 0.964 sec
> >> >> >> > > > Progress: 0%, response time: 1.366 sec
> >> >> >> > > > Progress: 0%, response time: 2.168 sec
> >> >> >> > > > Progress: 0%, response time: 3.17 sec
> >> >> >> > > > Progress: 0%, response time: 4.172 sec
> >> >> >> > > > Progress: 16%, response time: 5.174 sec
> >> >> >> > > > Progress: 16%, response time: 6.176 sec
> >> >> >> > > > Progress: 16%, response time: 7.178 sec
> >> >> >> > > > Progress: 33%, response time: 8.18 sec
> >> >> >> > > > Progress: 50%, response time: 9.181 sec
> >> >> >> > > > Progress: 50%, response time: 10.183 sec
> >> >> >> > > > Progress: 50%, response time: 11.185 sec
> >> >> >> > > > Progress: 50%, response time: 12.187 sec
> >> >> >> > > > Progress: 66%, response time: 13.189 sec
> >> >> >> > > > Progress: 66%, response time: 14.19 sec
> >> >> >> > > > Progress: 100%, response time: 15.003 sec
> >> >> >> > > > 2015-01-16T17:00:56.410+0800:
> [GC2015-01-16T17:00:56.410+0800:
> >> >> >> [ParNew:
> >> >> >> > > > 26473K->6582K(31488K), 0.0105030 secs]
> 26473K->6582K(115456K),
> >> >> >> > 0.0105720
> >> >> >> > > > secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> >> >> >> > > > 2015-01-16T17:00:56.593+0800:
> [GC2015-01-16T17:00:56.593+0800:
> >> >> >> [ParNew:
> >> >> >> > > > 27574K->6469K(31488K), 0.0086300 secs]
> 27574K->6469K(115456K),
> >> >> >> > 0.0086940
> >> >> >> > > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> >> >> >> > > > 2015-01-16T17:00:56.800+0800:
> [GC2015-01-16T17:00:56.800+0800:
> >> >> >> [ParNew:
> >> >> >> > > > 27461K->5664K(31488K), 0.0122560 secs]
> 27461K->6591K(115456K),
> >> >> >> > 0.0123210
> >> >> >> > > > secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> >> >> >> > > > 2015-01-16T17:00:57.065+0800:
> [GC2015-01-16T17:00:57.065+0800:
> >> >> >> [ParNew:
> >> >> >> > > > 26656K->6906K(31488K), 0.0070520 secs]
> 27583K->7833K(115456K),
> >> >> >> > 0.0071470
> >> >> >> > > > secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> >> >> >> > > > ?count
> >> >> >> > > > -------------------------------
> >> >> >> > > > 1222356
> >> >> >> > > > (1 rows, 15.003 sec, 8 B selected)
> >> >> >> > > >
> >> >> >> > > >
> >> >> >> > > > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <
> azuryyyu@gmail.com
> >> >
> >> >> >> wrote:
> >> >> >> > > >
> >> >> >> > > >> Thanks Kim, I'll try and post back.
> >> >> >> > > >>
> >> >> >> > > >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <
> jhkim@apache.org>
> >> >> >> wrote:
> >> >> >> > > >>
> >> >> >> > > >>> Thanks Azuryy Yu
> >> >> >> > > >>>
> >> >> >> > > >>> Your tajo-worker runs 10 parallel tasks but its heap memory
> >> >> >> > > >>> is only 3 GB. That causes long JVM pauses.
> >> >> >> > > >>> I recommend the following:
> >> >> >> > > >>>
> >> >> >> > > >>> tajo-env.sh
> >> >> >> > > >>> TAJO_WORKER_HEAPSIZE=3000 or more
> >> >> >> > > >>>
> >> >> >> > > >>> tajo-site.xml
> >> >> >> > > >>> <!--  worker  -->
> >> >> >> > > >>> <property>
> >> >> >> > > >>>   <name>tajo.worker.resource.memory-mb</name>
> >> >> >> > > >>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
> >> >> >> > > >>> </property>
> >> >> >> > > >>> <property>
> >> >> >> > > >>>   <name>tajo.task.memory-slot-mb.default</name>
> >> >> >> > > >>>   <value>1000</value> <!--  default 512 -->
> >> >> >> > > >>> </property>
> >> >> >> > > >>> <property>
> >> >> >> > > >>>    <name>tajo.worker.resource.dfs-dir-aware</name>
> >> >> >> > > >>>    <value>true</value>
> >> >> >> > > >>> </property>
> >> >> >> > > >>> <!--  end  -->
> >> >> >> > > >>>
> >> >> >> > > >>> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
> >> >> >> > > >>>
> >> >> >> > > >>> -Jinho
> >> >> >> > > >>> Best regards
> >> >> >> > > >>>
> >> >> >> > > >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <azuryyyu@gmail.com
> >:
> >> >> >> > > >>>
> >> >> >> > > >>> > Thanks Kim.
> >> >> >> > > >>> >
> >> >> >> > > >>> > The following is my tajo-env and tajo-site
> >> >> >> > > >>> >
> >> >> >> > > >>> > *tajo-env.sh:*
> >> >> >> > > >>> > export HADOOP_HOME=/usr/local/hadoop
> >> >> >> > > >>> > export JAVA_HOME=/usr/local/java
> >> >> >> > > >>> > _TAJO_OPTS="-server -verbose:gc
> >> >> >> > > >>> >   -XX:+PrintGCDateStamps
> >> >> >> > > >>> >   -XX:+PrintGCDetails
> >> >> >> > > >>> >   -XX:+UseGCLogFileRotation
> >> >> >> > > >>> >   -XX:NumberOfGCLogFiles=9
> >> >> >> > > >>> >   -XX:GCLogFileSize=256m
> >> >> >> > > >>> >   -XX:+DisableExplicitGC
> >> >> >> > > >>> >   -XX:+UseCompressedOops
> >> >> >> > > >>> >   -XX:SoftRefLRUPolicyMSPerMB=0
> >> >> >> > > >>> >   -XX:+UseFastAccessorMethods
> >> >> >> > > >>> >   -XX:+UseParNewGC
> >> >> >> > > >>> >   -XX:+UseConcMarkSweepGC
> >> >> >> > > >>> >   -XX:+CMSParallelRemarkEnabled
> >> >> >> > > >>> >   -XX:CMSInitiatingOccupancyFraction=70
> >> >> >> > > >>> >   -XX:+UseCMSCompactAtFullCollection
> >> >> >> > > >>> >   -XX:CMSFullGCsBeforeCompaction=0
> >> >> >> > > >>> >   -XX:+CMSClassUnloadingEnabled
> >> >> >> > > >>> >   -XX:CMSMaxAbortablePrecleanTime=300
> >> >> >> > > >>> >   -XX:+CMSScavengeBeforeRemark
> >> >> >> > > >>> >   -XX:PermSize=160m
> >> >> >> > > >>> >   -XX:GCTimeRatio=19
> >> >> >> > > >>> >   -XX:SurvivorRatio=2
> >> >> >> > > >>> >   -XX:MaxTenuringThreshold=60"
> >> >> >> > > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m
> -Xmn256m"
> >> >> >> > > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> >> >> >> > > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m
> >> >> -Xmn256m"
> >> >> >> > > >>> > export TAJO_OPTS=$_TAJO_OPTS
> >> >> >> > > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> >> >> >> > > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> >> >> >> > > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> >> >> >> > > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> >> >> >> > > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> >> >> >> > > >>> > export TAJO_WORKER_STANDBY_MODE=true
> >> >> >> > > >>> >
> >> >> >> > > >>> > *tajo-site.xml:*
> >> >> >> > > >>> >
> >> >> >> > > >>> > <configuration>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.rootdir</name>
> >> >> >> > > >>> >     <value>hdfs://test-cluster/tajo</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.master.umbilical-rpc.address</name>
> >> >> >> > > >>> >     <value>10-0-86-51:26001</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.master.client-rpc.address</name>
> >> >> >> > > >>> >     <value>10-0-86-51:26002</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.resource-tracker.rpc.address</name>
> >> >> >> > > >>> >     <value>10-0-86-51:26003</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.catalog.client-rpc.address</name>
> >> >> >> > > >>> >     <value>10-0-86-51:26005</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.worker.tmpdir.locations</name>
> >> >> >> > > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <!--  worker  -->
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> >> >> >> > > >>> >     <value>4</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >  <property>
> >> >> >> > > >>> >    <name>tajo.worker.resource.memory-mb</name>
> >> >> >> > > >>> >    <value>5120</value>
> >> >> >> > > >>> >  </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> >> >> >> > > >>> >     <value>true</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.worker.resource.dedicated</name>
> >> >> >> > > >>> >     <value>true</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> >   <property>
> >> >> >> > > >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> >> >> >> > > >>> >     <value>0.6</value>
> >> >> >> > > >>> >   </property>
> >> >> >> > > >>> > </configuration>
> >> >> >> > > >>> >
> >> >> >> > > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <
> >> jhkim@apache.org>
> >> >> >> > wrote:
> >> >> >> > > >>> >
> >> >> >> > > >>> > > Hello Azuryy Yu,
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > I left some comments.
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > -Jinho
> >> >> >> > > >>> > > Best regards
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <
> azuryyyu@gmail.com
> >> >:
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > > Hi,
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > I tested Tajo before half a year, then not focus on
> >> Tajo
> >> >> >> > because
> >> >> >> > > >>> some
> >> >> >> > > >>> > > other
> >> >> >> > > >>> > > > works.
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > then I setup a small dev Tajo cluster this week.(six
> >> >> nodes,
> >> >> >> VM)
> >> >> >> > > >>> based
> >> >> >> > > >>> > on
> >> >> >> > > >>> > > > Hadoop-2.6.0.
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > so my questions is:
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn,
> >> using
> >> >> >> Yarn
> >> >> >> > > >>> > scheduler
> >> >> >> > > >>> > > > to manage  job resources. but now I found it doesn't
> >> rely
> >> >> on
> >> >> >> > > Yarn,
> >> >> >> > > >>> > > because
> >> >> >> > > >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo
> has
> >> >> his
> >> >> >> own
> >> >> >> > > job
> >> >> >> > > >>> > > > sheduler ?
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > Now Tajo uses its own task scheduler, and you can start
> >> >> >> > > >>> > > Tajo without the Yarn daemons.
> >> >> >> > > >>> > > Please refer to
> >> >> >> > > >>> > > http://tajo.apache.org/docs/0.9.0/configuration.html
> >> >> >> > > >>> > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > 2) Does that we need to put the file replications on
> >> every
> >> >> >> > nodes
> >> >> >> > > on
> >> >> >> > > >>> > Tajo
> >> >> >> > > >>> > > > cluster?
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > No, Tajo does not need more replication. If you set a
> >> >> >> > > >>> > > higher replication factor, data locality can be increased.
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > such as I have a six nodes Tajo cluster, then should I
> >> set
> >> >> HDFS
> >> >> >> > > block
> >> >> >> > > >>> > > > replication to six? because:
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > I noticed when I run Tajo query, some nodes are
> busy,
> >> but
> >> >> >> some
> >> >> >> > is
> >> >> >> > > >>> free.
> >> >> >> > > >>> > > > because the file's blocks are only located on these
> >> nodes.
> >> >> >> non
> >> >> >> > > >>> others.
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > In my opinion, you need to run the HDFS balancer:
> >> >> >> > > >>> > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> >
> >> >> >> > > >>>
> >> >> >> > > >>> > > http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> >> >> >> > > >>> > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > 3)the test data set is 4 million rows. nearly several
> GB.
> >> >> but
> >> >> >> > it's
> >> >> >> > > >>> very
> >> >> >> > > >>> > > > slow when I runing: select count(distinct ID) from
> >> ****;
> >> >> >> > > >>> > > > Any possible problems here?
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> >> >> >> > > >>> > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > > > Thanks
> >> >> >> > > >>> > > >
> >> >> >> > > >>> > >
> >> >> >> > > >>> >
> >> >> >> > > >>>
> >> >> >> > > >>
> >> >> >> > > >>
> >> >> >> > > >
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
>

> >> >> > > >>> >   -XX:SurvivorRatio=2
> >> >> > > >>> >   -XX:MaxTenuringThreshold=60"
> >> >> > > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> >> >> > > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> >> >> > > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m
> >> -Xmn256m"
> >> >> > > >>> > export TAJO_OPTS=$_TAJO_OPTS
> >> >> > > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> >> >> > > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> >> >> > > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> >> >> > > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> >> >> > > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> >> >> > > >>> > export TAJO_WORKER_STANDBY_MODE=true
> >> >> > > >>> >
> >> >> > > >>> > *tajo-site.xml:*
> >> >> > > >>> >
> >> >> > > >>> > <configuration>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.rootdir</name>
> >> >> > > >>> >     <value>hdfs://test-cluster/tajo</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.master.umbilical-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26001</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.master.client-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26002</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.resource-tracker.rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26003</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.catalog.client-rpc.address</name>
> >> >> > > >>> >     <value>10-0-86-51:26005</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.tmpdir.locations</name>
> >> >> > > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <!--  worker  -->
> >> >> > > >>> >   <property>
> >> >> > > >>> >
> >> >> >  <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> >> >> > > >>> >     <value>4</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >  <property>
> >> >> > > >>> >    <name>tajo.worker.resource.memory-mb</name>
> >> >> > > >>> >    <value>5120</value>
> >> >> > > >>> >  </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> >> >> > > >>> >     <value>true</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >     <name>tajo.worker.resource.dedicated</name>
> >> >> > > >>> >     <value>true</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> >   <property>
> >> >> > > >>> >
>  <name>tajo.worker.resource.dedicated-memory-ratio</name>
> >> >> > > >>> >     <value>0.6</value>
> >> >> > > >>> >   </property>
> >> >> > > >>> > </configuration>
> >> >> > > >>> >
> >> >> > > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <
> jhkim@apache.org>
> >> >> > wrote:
> >> >> > > >>> >
> >> >> > > >>> > > Hello Azuyy yu
> >> >> > > >>> > >
> >> >> > > >>> > > I left some comments.
> >> >> > > >>> > >
> >> >> > > >>> > > -Jinho
> >> >> > > >>> > > Best regards
> >> >> > > >>> > >
> >> >> > > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <azuryyyu@gmail.com
> >:
> >> >> > > >>> > >
> >> >> > > >>> > > > Hi,
> >> >> > > >>> > > >
> >> >> > > >>> > > > I tested Tajo before half a year, then not focus on
> Tajo
> >> >> > because
> >> >> > > >>> some
> >> >> > > >>> > > other
> >> >> > > >>> > > > works.
> >> >> > > >>> > > >
> >> >> > > >>> > > > then I setup a small dev Tajo cluster this week.(six
> >> nodes,
> >> >> VM)
> >> >> > > >>> based
> >> >> > > >>> > on
> >> >> > > >>> > > > Hadoop-2.6.0.
> >> >> > > >>> > > >
> >> >> > > >>> > > > so my questions is:
> >> >> > > >>> > > >
> >> >> > > >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn,
> using
> >> >> Yarn
> >> >> > > >>> > scheduler
> >> >> > > >>> > > > to manage  job resources. but now I found it doesn't
> rely
> >> on
> >> >> > > Yarn,
> >> >> > > >>> > > because
> >> >> > > >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has
> >> his
> >> >> own
> >> >> > > job
> >> >> > > >>> > > > sheduler ?
> >> >> > > >>> > > >
> >> >> > > >>> > > >
> >> >> > > >>> > > Now, tajo does using own task scheduler. and  You can
> start
> >> >> tajo
> >> >> > > >>> without
> >> >> > > >>> > > Yarn daemons
> >> >> > > >>> > > Please refer to
> >> >> > > http://tajo.apache.org/docs/0.9.0/configuration.html
> >> >> > > >>> > >
> >> >> > > >>> > >
> >> >> > > >>> > > >
> >> >> > > >>> > > > 2) Does that we need to put the file replications on
> every
> >> >> > nodes
> >> >> > > on
> >> >> > > >>> > Tajo
> >> >> > > >>> > > > cluster?
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > No, tajo does not need more replication.  if you set more
> >> >> > > >>> replication,
> >> >> > > >>> > data
> >> >> > > >>> > > locality can be increased
> >> >> > > >>> > >
> >> >> > > >>> > > such as I have a six nodes Tajo cluster, then should I
> set
> >> HDFS
> >> >> > > block
> >> >> > > >>> > > > replication to six? because:
> >> >> > > >>> > > >
> >> >> > > >>> > > > I noticed when I run Tajo query, some nodes are busy,
> but
> >> >> some
> >> >> > is
> >> >> > > >>> free.
> >> >> > > >>> > > > because the file's blocks are only located on these
> nodes.
> >> >> non
> >> >> > > >>> others.
> >> >> > > >>> > > >
> >> >> > > >>> > > >
> >> >> > > >>> > > In my opinion, you need to run balancer
> >> >> > > >>> > >
> >> >> > > >>> > >
> >> >> > > >>> >
> >> >> > > >>>
> >> >> > >
> >> >> >
> >> >>
> >>
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> >> >> > > >>> > >
> >> >> > > >>> > >
> >> >> > > >>> > > 3)the test data set is 4 million rows. nearly several GB.
> >> but
> >> >> > it's
> >> >> > > >>> very
> >> >> > > >>> > > > slow when I runing: select count(distinct ID) from
> ****;
> >> >> > > >>> > > > Any possible problems here?
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> >> >> > > >>> > >
> >> >> > > >>> > >
> >> >> > > >>> > > >
> >> >> > > >>> > > >
> >> >> > > >>> > > > Thanks
> >> >> > > >>> > > >
> >> >> > > >>> > >
> >> >> > > >>> >
> >> >> > > >>>
> >> >> > > >>
> >> >> > > >>
> >> >> > > >
> >> >> > >
> >> >> >
> >> >>
> >>
>

Re: Some Tajo-0.9.0 questions

Posted by Hyunsik Choi <hy...@apache.org>.
Thank you for sharing the machine information.

In my opinion, we can boost Tajo performance on that machine very much
with a proper configuration if the server is dedicated to Tajo. I think
the configuration we mentioned above uses only some of the physical
resources of the machine :)
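
As an illustration only (the figures are assumptions for a
Tajo-dedicated 24-core / 64 GB / 12-disk machine like the one described
in this thread, not tuned recommendations), the worker could be given
most of the machine instead of the 5120 MB configured earlier, roughly
along these lines in tajo-site.xml; TAJO_WORKER_HEAPSIZE in tajo-env.sh
would then need to be raised to match, as Jinho noted:

<property>
  <!-- assumption: leave a few GB of the 64 GB for the OS and other daemons -->
  <name>tajo.worker.resource.memory-mb</name>
  <value>57344</value>
</property>
<property>
  <!-- let the worker derive disk resources from the dfs data dirs -->
  <name>tajo.worker.resource.dfs-dir-aware</name>
  <value>true</value>
</property>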

Warm regards,
Hyunsik

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Thanks Hyunsik.

I asked our infra team: my 6-node Tajo cluster was virtualized from one
host. That means I am running the 6-node Tajo cluster on one physical
host (24 CPUs, 64 GB memory, 12 x 4 TB HDDs).

So I think this was the real performance bottleneck.

Re: Some Tajo-0.9.0 questions

Posted by Hyunsik Choi <hy...@apache.org>.
Hi Azuryy,

Tajo automatically rewrites distinct aggregation queries into
multi-level aggregations, so the query rewrite that Jinho suggested
(the GROUP BY subquery form) may already be applied.

I think your query response times (12 ~ 15 secs) for the distinct
count are reasonable, given that the plain count aggregation already
takes 5 secs. Distinct aggregation queries are usually much slower than
plain aggregation queries because distinct aggregation involves a sort,
large intermediate data, and handling of only the distinct values.

In addition, I have a question so that I can give a better
configuration guide. Could you share the CPU, memory, and disks
available for Tajo?

Even though Jinho already suggested one, there is still room to set a
more exact and better configuration. Since the resource configuration
determines the number of concurrent tasks, it may be the main cause of
your performance problem.
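
As a sketch (the property names and the "3 tasks + 1 query master"
sizing come from Jinho's earlier mail; the exact values are just an
example, not a verified recommendation), the two settings whose ratio
fixes the per-worker concurrency are:

<!-- concurrent tasks per worker is roughly
     tajo.worker.resource.memory-mb / tajo.task.memory-slot-mb.default,
     e.g. 3512 / 1000 gives about 3 tasks plus one query-master slot -->
<property>
  <name>tajo.worker.resource.memory-mb</name>
  <value>3512</value>
</property>
<property>
  <name>tajo.task.memory-slot-mb.default</name>
  <value>1000</value>
</property>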

Best regards,
Hyunsik

On Sun, Jan 18, 2015 at 6:54 PM, Jinho Kim <jh...@apache.org> wrote:
>  Sorry for my mistake example query.
> Can you change to “select count(a.auid) from ( select auid from
> test_pl_00_0 group by auid ) a;” ?
>
> -Jinho
> Best regards
>
> 2015-01-19 11:44 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>
>> Sorry for no response during weekend.
>> I changed hdfs-site.xml and restart hdfs and tajo.but  It's more slow than
>> before.
>>
>> default> select count(a.auid) from ( select auid from test_pl_00_0 ) a;
>> Progress: 0%, response time: 1.132 sec
>> Progress: 0%, response time: 1.134 sec
>> Progress: 0%, response time: 1.536 sec
>> Progress: 0%, response time: 2.338 sec
>> Progress: 0%, response time: 3.341 sec
>> Progress: 3%, response time: 4.343 sec
>> Progress: 4%, response time: 5.346 sec
>> Progress: 9%, response time: 6.35 sec
>> Progress: 11%, response time: 7.352 sec
>> Progress: 16%, response time: 8.354 sec
>> Progress: 18%, response time: 9.362 sec
>> Progress: 24%, response time: 10.364 sec
>> Progress: 27%, response time: 11.366 sec
>> Progress: 29%, response time: 12.368 sec
>> Progress: 32%, response time: 13.37 sec
>> Progress: 37%, response time: 14.373 sec
>> Progress: 40%, response time: 15.377 sec
>> Progress: 42%, response time: 16.379 sec
>> Progress: 42%, response time: 17.382 sec
>> Progress: 43%, response time: 18.384 sec
>> Progress: 43%, response time: 19.386 sec
>> Progress: 45%, response time: 20.388 sec
>> Progress: 45%, response time: 21.391 sec
>> Progress: 46%, response time: 22.393 sec
>> Progress: 46%, response time: 23.395 sec
>> Progress: 48%, response time: 24.398 sec
>> Progress: 48%, response time: 25.401 sec
>> Progress: 50%, response time: 26.403 sec
>> Progress: 100%, response time: 26.95 sec
>> ?count
>> -------------------------------
>> 4487999
>> (1 rows, 26.95 sec, 8 B selected)
>> default> select count(distinct auid) from test_pl_00_0;
>> Progress: 0%, response time: 0.88 sec
>> Progress: 0%, response time: 0.881 sec
>> Progress: 0%, response time: 1.283 sec
>> Progress: 0%, response time: 2.086 sec
>> Progress: 0%, response time: 3.088 sec
>> Progress: 0%, response time: 4.09 sec
>> Progress: 25%, response time: 5.092 sec
>> Progress: 33%, response time: 6.094 sec
>> Progress: 50%, response time: 7.096 sec
>> Progress: 50%, response time: 8.098 sec
>> Progress: 50%, response time: 9.099 sec
>> Progress: 66%, response time: 10.101 sec
>> Progress: 66%, response time: 11.103 sec
>> Progress: 83%, response time: 12.105 sec
>> Progress: 100%, response time: 12.268 sec
>> ?count
>> -------------------------------
>> 1222356
>> (1 rows, 12.268 sec, 8 B selected)
>>
>> On Sat, Jan 17, 2015 at 11:00 PM, Jinho Kim <jh...@apache.org> wrote:
>>
>> >  Thank you for your sharing
>> >
>> > Can you enable the dfs.datanode.hdfs-blocks-metadata.enabled in
>> > hdfs-site.xml ?
>> > If you enable the block-metadata, tajo-cluster can use the volume load
>> > balancing. You should restart the datanode and tajo cluster. I will
>> > investigate performance of count-distinct operator. and You can change to
>> > “select count(a.auid) from ( select auid from test_pl_00_0 ) a”
>> >
>> >
>> > -Jinho
>> > Best regards
>> >
>> > 2015-01-16 18:05 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>> >
>> > > default> select count(*) from test_pl_00_0;
>> > > Progress: 0%, response time: 0.718 sec
>> > > Progress: 0%, response time: 0.72 sec
>> > > Progress: 0%, response time: 1.121 sec
>> > > Progress: 12%, response time: 1.923 sec
>> > > Progress: 28%, response time: 2.925 sec
>> > > Progress: 41%, response time: 3.927 sec
>> > > Progress: 50%, response time: 4.931 sec
>> > > Progress: 100%, response time: 5.323 sec
>> > > 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew:
>> > > 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K),
>> 0.0080700
>> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> > > 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew:
>> > > 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K),
>> 0.0068130
>> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> > > 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew:
>> > > 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K),
>> 0.0092430
>> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> > > 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew:
>> > > 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K),
>> 0.0068160
>> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> > > ?count
>> > > -------------------------------
>> > > 4487999
>> > > (1 rows, 5.323 sec, 8 B selected)
>> > >
>> > > On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <az...@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > > There is no big improvement, sometimes more slower than before. I
>> also
>> > > try
>> > > > to increase worker's heap size and parallel, nothing improve.
>> > > >
>> > > > default> select count(distinct auid) from test_pl_00_0;
>> > > > Progress: 0%, response time: 0.963 sec
>> > > > Progress: 0%, response time: 0.964 sec
>> > > > Progress: 0%, response time: 1.366 sec
>> > > > Progress: 0%, response time: 2.168 sec
>> > > > Progress: 0%, response time: 3.17 sec
>> > > > Progress: 0%, response time: 4.172 sec
>> > > > Progress: 16%, response time: 5.174 sec
>> > > > Progress: 16%, response time: 6.176 sec
>> > > > Progress: 16%, response time: 7.178 sec
>> > > > Progress: 33%, response time: 8.18 sec
>> > > > Progress: 50%, response time: 9.181 sec
>> > > > Progress: 50%, response time: 10.183 sec
>> > > > Progress: 50%, response time: 11.185 sec
>> > > > Progress: 50%, response time: 12.187 sec
>> > > > Progress: 66%, response time: 13.189 sec
>> > > > Progress: 66%, response time: 14.19 sec
>> > > > Progress: 100%, response time: 15.003 sec
>> > > > 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800:
>> [ParNew:
>> > > > 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K),
>> > 0.0105720
>> > > > secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
>> > > > 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800:
>> [ParNew:
>> > > > 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K),
>> > 0.0086940
>> > > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
>> > > > 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800:
>> [ParNew:
>> > > > 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K),
>> > 0.0123210
>> > > > secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
>> > > > 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800:
>> [ParNew:
>> > > > 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K),
>> > 0.0071470
>> > > > secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
>> > > > ?count
>> > > > -------------------------------
>> > > > 1222356
>> > > > (1 rows, 15.003 sec, 8 B selected)
>> > > >
>> > > >
>> > > > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com>
>> wrote:
>> > > >
>> > > >> Thanks Kim, I'll try and post back.
>> > > >>
>> > > >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org>
>> wrote:
>> > > >>
>> > > >>> Thanks Azuryy Yu
>> > > >>>
>> > > >>> Your parallel running tasks of tajo-worker is 10 but heap memory is
>> > > 3GB.
>> > > >>> It
>> > > >>> cause a long JVM pause
>> > > >>> I recommend following :
>> > > >>>
>> > > >>> tajo-env.sh
>> > > >>> TAJO_WORKER_HEAPSIZE=3000 or more
>> > > >>>
>> > > >>> tajo-site.xml
>> > > >>> <!--  worker  -->
>> > > >>> <property>
>> > > >>>   <name>tajo.worker.resource.memory-mb</name>
>> > > >>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
>> > > >>> </property>
>> > > >>> <property>
>> > > >>>   <name>tajo.task.memory-slot-mb.default</name>
>> > > >>>   <value>1000</value> <!--  default 512 -->
>> > > >>> </property>
>> > > >>> <property>
>> > > >>>    <name>tajo.worker.resource.dfs-dir-aware</name>
>> > > >>>    <value>true</value>
>> > > >>> </property>
>> > > >>> <!--  end  -->
>> > > >>>
>> > >
>> >
>> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
>> > > >>>
>> > > >>> -Jinho
>> > > >>> Best regards
>> > > >>>
>> > > >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>> > > >>>
>> > > >>> > Thanks Kim.
>> > > >>> >
>> > > >>> > The following is my tajo-env and tajo-site
>> > > >>> >
>> > > >>> > *tajo-env.sh:*
>> > > >>> > export HADOOP_HOME=/usr/local/hadoop
>> > > >>> > export JAVA_HOME=/usr/local/java
>> > > >>> > _TAJO_OPTS="-server -verbose:gc
>> > > >>> >   -XX:+PrintGCDateStamps
>> > > >>> >   -XX:+PrintGCDetails
>> > > >>> >   -XX:+UseGCLogFileRotation
>> > > >>> >   -XX:NumberOfGCLogFiles=9
>> > > >>> >   -XX:GCLogFileSize=256m
>> > > >>> >   -XX:+DisableExplicitGC
>> > > >>> >   -XX:+UseCompressedOops
>> > > >>> >   -XX:SoftRefLRUPolicyMSPerMB=0
>> > > >>> >   -XX:+UseFastAccessorMethods
>> > > >>> >   -XX:+UseParNewGC
>> > > >>> >   -XX:+UseConcMarkSweepGC
>> > > >>> >   -XX:+CMSParallelRemarkEnabled
>> > > >>> >   -XX:CMSInitiatingOccupancyFraction=70
>> > > >>> >   -XX:+UseCMSCompactAtFullCollection
>> > > >>> >   -XX:CMSFullGCsBeforeCompaction=0
>> > > >>> >   -XX:+CMSClassUnloadingEnabled
>> > > >>> >   -XX:CMSMaxAbortablePrecleanTime=300
>> > > >>> >   -XX:+CMSScavengeBeforeRemark
>> > > >>> >   -XX:PermSize=160m
>> > > >>> >   -XX:GCTimeRatio=19
>> > > >>> >   -XX:SurvivorRatio=2
>> > > >>> >   -XX:MaxTenuringThreshold=60"
>> > > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>> > > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
>> > > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>> > > >>> > export TAJO_OPTS=$_TAJO_OPTS
>> > > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
>> > > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
>> > > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
>> > > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
>> > > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
>> > > >>> > export TAJO_WORKER_STANDBY_MODE=true
>> > > >>> >
>> > > >>> > *tajo-site.xml:*
>> > > >>> >
>> > > >>> > <configuration>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.rootdir</name>
>> > > >>> >     <value>hdfs://test-cluster/tajo</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.master.umbilical-rpc.address</name>
>> > > >>> >     <value>10-0-86-51:26001</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.master.client-rpc.address</name>
>> > > >>> >     <value>10-0-86-51:26002</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.resource-tracker.rpc.address</name>
>> > > >>> >     <value>10-0-86-51:26003</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.catalog.client-rpc.address</name>
>> > > >>> >     <value>10-0-86-51:26005</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.worker.tmpdir.locations</name>
>> > > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
>> > > >>> >   </property>
>> > > >>> >   <!--  worker  -->
>> > > >>> >   <property>
>> > > >>> >
>> >  <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
>> > > >>> >     <value>4</value>
>> > > >>> >   </property>
>> > > >>> >  <property>
>> > > >>> >    <name>tajo.worker.resource.memory-mb</name>
>> > > >>> >    <value>5120</value>
>> > > >>> >  </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
>> > > >>> >     <value>true</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.worker.resource.dedicated</name>
>> > > >>> >     <value>true</value>
>> > > >>> >   </property>
>> > > >>> >   <property>
>> > > >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
>> > > >>> >     <value>0.6</value>
>> > > >>> >   </property>
>> > > >>> > </configuration>
>> > > >>> >
>> > > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org>
>> > wrote:
>> > > >>> >
>> > > >>> > > Hello Azuyy yu
>> > > >>> > >
>> > > >>> > > I left some comments.
>> > > >>> > >
>> > > >>> > > -Jinho
>> > > >>> > > Best regards
>> > > >>> > >
>> > > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>> > > >>> > >
>> > > >>> > > > Hi,
>> > > >>> > > >
>> > > >>> > > > I tested Tajo before half a year, then not focus on Tajo
>> > because
>> > > >>> some
>> > > >>> > > other
>> > > >>> > > > works.
>> > > >>> > > >
>> > > >>> > > > then I setup a small dev Tajo cluster this week.(six nodes,
>> VM)
>> > > >>> based
>> > > >>> > on
>> > > >>> > > > Hadoop-2.6.0.
>> > > >>> > > >
>> > > >>> > > > so my questions is:
>> > > >>> > > >
>> > > >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using
>> Yarn
>> > > >>> > scheduler
>> > > >>> > > > to manage  job resources. but now I found it doesn't rely on
>> > > Yarn,
>> > > >>> > > because
>> > > >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his
>> own
>> > > job
>> > > >>> > > > sheduler ?
>> > > >>> > > >
>> > > >>> > > >
>> > > >>> > > Now, tajo does using own task scheduler. and  You can start
>> tajo
>> > > >>> without
>> > > >>> > > Yarn daemons
>> > > >>> > > Please refer to
>> > > http://tajo.apache.org/docs/0.9.0/configuration.html
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > >
>> > > >>> > > > 2) Does that we need to put the file replications on every
>> > nodes
>> > > on
>> > > >>> > Tajo
>> > > >>> > > > cluster?
>> > > >>> > > >
>> > > >>> > >
>> > > >>> > > No, tajo does not need more replication.  if you set more
>> > > >>> replication,
>> > > >>> > data
>> > > >>> > > locality can be increased
>> > > >>> > >
>> > > >>> > > such as I have a six nodes Tajo cluster, then should I set HDFS
>> > > block
>> > > >>> > > > replication to six? because:
>> > > >>> > > >
>> > > >>> > > > I noticed when I run Tajo query, some nodes are busy, but
>> some
>> > is
>> > > >>> free.
>> > > >>> > > > because the file's blocks are only located on these nodes.
>> non
>> > > >>> others.
>> > > >>> > > >
>> > > >>> > > >
>> > > >>> > > In my opinion, you need to run balancer
>> > > >>> > >
>> > > >>> > >
>> > > >>> >
>> > > >>>
>> > >
>> >
>> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > 3)the test data set is 4 million rows. nearly several GB. but
>> > it's
>> > > >>> very
>> > > >>> > > > slow when I runing: select count(distinct ID) from ****;
>> > > >>> > > > Any possible problems here?
>> > > >>> > > >
>> > > >>> > >
>> > > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
>> > > >>> > >
>> > > >>> > >
>> > > >>> > > >
>> > > >>> > > >
>> > > >>> > > > Thanks
>> > > >>> > > >
>> > > >>> > >
>> > > >>> >
>> > > >>>
>> > > >>
>> > > >>
>> > > >
>> > >
>> >
>>

Re: Some Tajo-0.9.0 questions

Posted by Jinho Kim <jh...@apache.org>.
Sorry, my example query in the previous mail was mistaken.
Can you change it to “select count(a.auid) from ( select auid from
test_pl_00_0 group by auid ) a;” ?
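
To make the rewrite explicit: the inner GROUP BY deduplicates auid first,
so the outer count only counts one row per distinct value (same
test_pl_00_0 table and auid column as in your test):

-- original form
select count(distinct auid) from test_pl_00_0;
-- rewritten form: the subquery deduplicates, the outer count is plain
select count(a.auid) from ( select auid from test_pl_00_0 group by auid ) a;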

-Jinho
Best regards

2015-01-19 11:44 GMT+09:00 Azuryy Yu <az...@gmail.com>:

> Sorry for no response during weekend.
> I changed hdfs-site.xml and restart hdfs and tajo.but  It's more slow than
> before.
>
> default> select count(a.auid) from ( select auid from test_pl_00_0 ) a;
> Progress: 0%, response time: 1.132 sec
> Progress: 0%, response time: 1.134 sec
> Progress: 0%, response time: 1.536 sec
> Progress: 0%, response time: 2.338 sec
> Progress: 0%, response time: 3.341 sec
> Progress: 3%, response time: 4.343 sec
> Progress: 4%, response time: 5.346 sec
> Progress: 9%, response time: 6.35 sec
> Progress: 11%, response time: 7.352 sec
> Progress: 16%, response time: 8.354 sec
> Progress: 18%, response time: 9.362 sec
> Progress: 24%, response time: 10.364 sec
> Progress: 27%, response time: 11.366 sec
> Progress: 29%, response time: 12.368 sec
> Progress: 32%, response time: 13.37 sec
> Progress: 37%, response time: 14.373 sec
> Progress: 40%, response time: 15.377 sec
> Progress: 42%, response time: 16.379 sec
> Progress: 42%, response time: 17.382 sec
> Progress: 43%, response time: 18.384 sec
> Progress: 43%, response time: 19.386 sec
> Progress: 45%, response time: 20.388 sec
> Progress: 45%, response time: 21.391 sec
> Progress: 46%, response time: 22.393 sec
> Progress: 46%, response time: 23.395 sec
> Progress: 48%, response time: 24.398 sec
> Progress: 48%, response time: 25.401 sec
> Progress: 50%, response time: 26.403 sec
> Progress: 100%, response time: 26.95 sec
> ?count
> -------------------------------
> 4487999
> (1 rows, 26.95 sec, 8 B selected)
> default> select count(distinct auid) from test_pl_00_0;
> Progress: 0%, response time: 0.88 sec
> Progress: 0%, response time: 0.881 sec
> Progress: 0%, response time: 1.283 sec
> Progress: 0%, response time: 2.086 sec
> Progress: 0%, response time: 3.088 sec
> Progress: 0%, response time: 4.09 sec
> Progress: 25%, response time: 5.092 sec
> Progress: 33%, response time: 6.094 sec
> Progress: 50%, response time: 7.096 sec
> Progress: 50%, response time: 8.098 sec
> Progress: 50%, response time: 9.099 sec
> Progress: 66%, response time: 10.101 sec
> Progress: 66%, response time: 11.103 sec
> Progress: 83%, response time: 12.105 sec
> Progress: 100%, response time: 12.268 sec
> ?count
> -------------------------------
> 1222356
> (1 rows, 12.268 sec, 8 B selected)
>
> On Sat, Jan 17, 2015 at 11:00 PM, Jinho Kim <jh...@apache.org> wrote:
>
> >  Thank you for your sharing
> >
> > Can you enable the dfs.datanode.hdfs-blocks-metadata.enabled in
> > hdfs-site.xml ?
> > If you enable the block-metadata, tajo-cluster can use the volume load
> > balancing. You should restart the datanode and tajo cluster. I will
> > investigate performance of count-distinct operator. and You can change to
> > “select count(a.auid) from ( select auid from test_pl_00_0 ) a”
> >
> >
> > -Jinho
> > Best regards
> >
> > 2015-01-16 18:05 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >
> > > default> select count(*) from test_pl_00_0;
> > > Progress: 0%, response time: 0.718 sec
> > > Progress: 0%, response time: 0.72 sec
> > > Progress: 0%, response time: 1.121 sec
> > > Progress: 12%, response time: 1.923 sec
> > > Progress: 28%, response time: 2.925 sec
> > > Progress: 41%, response time: 3.927 sec
> > > Progress: 50%, response time: 4.931 sec
> > > Progress: 100%, response time: 5.323 sec
> > > 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew:
> > > 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K),
> 0.0080700
> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew:
> > > 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K),
> 0.0068130
> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew:
> > > 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K),
> 0.0092430
> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew:
> > > 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K),
> 0.0068160
> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > ?count
> > > -------------------------------
> > > 4487999
> > > (1 rows, 5.323 sec, 8 B selected)
> > >
> > > On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <az...@gmail.com> wrote:
> > >
> > > > Hi,
> > > > There is no big improvement, sometimes more slower than before. I
> also
> > > try
> > > > to increase worker's heap size and parallel, nothing improve.
> > > >
> > > > default> select count(distinct auid) from test_pl_00_0;
> > > > Progress: 0%, response time: 0.963 sec
> > > > Progress: 0%, response time: 0.964 sec
> > > > Progress: 0%, response time: 1.366 sec
> > > > Progress: 0%, response time: 2.168 sec
> > > > Progress: 0%, response time: 3.17 sec
> > > > Progress: 0%, response time: 4.172 sec
> > > > Progress: 16%, response time: 5.174 sec
> > > > Progress: 16%, response time: 6.176 sec
> > > > Progress: 16%, response time: 7.178 sec
> > > > Progress: 33%, response time: 8.18 sec
> > > > Progress: 50%, response time: 9.181 sec
> > > > Progress: 50%, response time: 10.183 sec
> > > > Progress: 50%, response time: 11.185 sec
> > > > Progress: 50%, response time: 12.187 sec
> > > > Progress: 66%, response time: 13.189 sec
> > > > Progress: 66%, response time: 14.19 sec
> > > > Progress: 100%, response time: 15.003 sec
> > > > 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800:
> [ParNew:
> > > > 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K),
> > 0.0105720
> > > > secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> > > > 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800:
> [ParNew:
> > > > 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K),
> > 0.0086940
> > > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > > 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800:
> [ParNew:
> > > > 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K),
> > 0.0123210
> > > > secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> > > > 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800:
> [ParNew:
> > > > 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K),
> > 0.0071470
> > > > secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> > > > ?count
> > > > -------------------------------
> > > > 1222356
> > > > (1 rows, 15.003 sec, 8 B selected)
> > > >
> > > >
> > > > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com>
> wrote:
> > > >
> > > >> Thanks Kim, I'll try and post back.
> > > >>
> > > >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org>
> wrote:
> > > >>
> > > >>> Thanks Azuryy Yu
> > > >>>
> > > >>> Your parallel running tasks of tajo-worker is 10 but heap memory is
> > > 3GB.
> > > >>> It
> > > >>> cause a long JVM pause
> > > >>> I recommend following :
> > > >>>
> > > >>> tajo-env.sh
> > > >>> TAJO_WORKER_HEAPSIZE=3000 or more
> > > >>>
> > > >>> tajo-site.xml
> > > >>> <!--  worker  -->
> > > >>> <property>
> > > >>>   <name>tajo.worker.resource.memory-mb</name>
> > > >>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
> > > >>> </property>
> > > >>> <property>
> > > >>>   <name>tajo.task.memory-slot-mb.default</name>
> > > >>>   <value>1000</value> <!--  default 512 -->
> > > >>> </property>
> > > >>> <property>
> > > >>>    <name>tajo.worker.resource.dfs-dir-aware</name>
> > > >>>    <value>true</value>
> > > >>> </property>
> > > >>> <!--  end  -->
> > > >>>
> > >
> >
> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
> > > >>>
> > > >>> -Jinho
> > > >>> Best regards
> > > >>>
> > > >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> > > >>>
> > > >>> > Thanks Kim.
> > > >>> >
> > > >>> > The following is my tajo-env and tajo-site
> > > >>> >
> > > >>> > *tajo-env.sh:*
> > > >>> > export HADOOP_HOME=/usr/local/hadoop
> > > >>> > export JAVA_HOME=/usr/local/java
> > > >>> > _TAJO_OPTS="-server -verbose:gc
> > > >>> >   -XX:+PrintGCDateStamps
> > > >>> >   -XX:+PrintGCDetails
> > > >>> >   -XX:+UseGCLogFileRotation
> > > >>> >   -XX:NumberOfGCLogFiles=9
> > > >>> >   -XX:GCLogFileSize=256m
> > > >>> >   -XX:+DisableExplicitGC
> > > >>> >   -XX:+UseCompressedOops
> > > >>> >   -XX:SoftRefLRUPolicyMSPerMB=0
> > > >>> >   -XX:+UseFastAccessorMethods
> > > >>> >   -XX:+UseParNewGC
> > > >>> >   -XX:+UseConcMarkSweepGC
> > > >>> >   -XX:+CMSParallelRemarkEnabled
> > > >>> >   -XX:CMSInitiatingOccupancyFraction=70
> > > >>> >   -XX:+UseCMSCompactAtFullCollection
> > > >>> >   -XX:CMSFullGCsBeforeCompaction=0
> > > >>> >   -XX:+CMSClassUnloadingEnabled
> > > >>> >   -XX:CMSMaxAbortablePrecleanTime=300
> > > >>> >   -XX:+CMSScavengeBeforeRemark
> > > >>> >   -XX:PermSize=160m
> > > >>> >   -XX:GCTimeRatio=19
> > > >>> >   -XX:SurvivorRatio=2
> > > >>> >   -XX:MaxTenuringThreshold=60"
> > > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> > > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > > >>> > export TAJO_OPTS=$_TAJO_OPTS
> > > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> > > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> > > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> > > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> > > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> > > >>> > export TAJO_WORKER_STANDBY_MODE=true
> > > >>> >
> > > >>> > *tajo-site.xml:*
> > > >>> >
> > > >>> > <configuration>
> > > >>> >   <property>
> > > >>> >     <name>tajo.rootdir</name>
> > > >>> >     <value>hdfs://test-cluster/tajo</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.master.umbilical-rpc.address</name>
> > > >>> >     <value>10-0-86-51:26001</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.master.client-rpc.address</name>
> > > >>> >     <value>10-0-86-51:26002</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.resource-tracker.rpc.address</name>
> > > >>> >     <value>10-0-86-51:26003</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.catalog.client-rpc.address</name>
> > > >>> >     <value>10-0-86-51:26005</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.worker.tmpdir.locations</name>
> > > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> > > >>> >   </property>
> > > >>> >   <!--  worker  -->
> > > >>> >   <property>
> > > >>> >
> >  <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> > > >>> >     <value>4</value>
> > > >>> >   </property>
> > > >>> >  <property>
> > > >>> >    <name>tajo.worker.resource.memory-mb</name>
> > > >>> >    <value>5120</value>
> > > >>> >  </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> > > >>> >     <value>true</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.worker.resource.dedicated</name>
> > > >>> >     <value>true</value>
> > > >>> >   </property>
> > > >>> >   <property>
> > > >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> > > >>> >     <value>0.6</value>
> > > >>> >   </property>
> > > >>> > </configuration>
> > > >>> >
> > > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org>
> > wrote:
> > > >>> >
> > > >>> > > Hello Azuyy yu
> > > >>> > >
> > > >>> > > I left some comments.
> > > >>> > >
> > > >>> > > -Jinho
> > > >>> > > Best regards
> > > >>> > >
> > > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> > > >>> > >
> > > >>> > > > Hi,
> > > >>> > > >
> > > >>> > > > I tested Tajo before half a year, then not focus on Tajo
> > because
> > > >>> some
> > > >>> > > other
> > > >>> > > > works.
> > > >>> > > >
> > > >>> > > > then I setup a small dev Tajo cluster this week.(six nodes,
> VM)
> > > >>> based
> > > >>> > on
> > > >>> > > > Hadoop-2.6.0.
> > > >>> > > >
> > > >>> > > > so my questions is:
> > > >>> > > >
> > > >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using
> Yarn
> > > >>> > scheduler
> > > >>> > > > to manage  job resources. but now I found it doesn't rely on
> > > Yarn,
> > > >>> > > because
> > > >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his
> own
> > > job
> > > >>> > > > sheduler ?
> > > >>> > > >
> > > >>> > > >
> > > >>> > > Now, tajo does using own task scheduler. and  You can start
> tajo
> > > >>> without
> > > >>> > > Yarn daemons
> > > >>> > > Please refer to
> > > http://tajo.apache.org/docs/0.9.0/configuration.html
> > > >>> > >
> > > >>> > >
> > > >>> > > >
> > > >>> > > > 2) Does that we need to put the file replications on every
> > nodes
> > > on
> > > >>> > Tajo
> > > >>> > > > cluster?
> > > >>> > > >
> > > >>> > >
> > > >>> > > No, tajo does not need more replication.  if you set more
> > > >>> replication,
> > > >>> > data
> > > >>> > > locality can be increased
> > > >>> > >
> > > >>> > > such as I have a six nodes Tajo cluster, then should I set HDFS
> > > block
> > > >>> > > > replication to six? because:
> > > >>> > > >
> > > >>> > > > I noticed when I run Tajo query, some nodes are busy, but
> some
> > is
> > > >>> free.
> > > >>> > > > because the file's blocks are only located on these nodes.
> non
> > > >>> others.
> > > >>> > > >
> > > >>> > > >
> > > >>> > > In my opinion, you need to run balancer
> > > >>> > >
> > > >>> > >
> > > >>> >
> > > >>>
> > >
> >
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> > > >>> > >
> > > >>> > >
> > > >>> > > 3)the test data set is 4 million rows. nearly several GB. but
> > it's
> > > >>> very
> > > >>> > > > slow when I runing: select count(distinct ID) from ****;
> > > >>> > > > Any possible problems here?
> > > >>> > > >
> > > >>> > >
> > > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> > > >>> > >
> > > >>> > >
> > > >>> > > >
> > > >>> > > >
> > > >>> > > > Thanks
> > > >>> > > >
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Sorry for the slow response over the weekend.
I changed hdfs-site.xml and restarted HDFS and Tajo, but it's slower than
before.

default> select count(a.auid) from ( select auid from test_pl_00_0 ) a;
Progress: 0%, response time: 1.132 sec
Progress: 0%, response time: 1.134 sec
Progress: 0%, response time: 1.536 sec
Progress: 0%, response time: 2.338 sec
Progress: 0%, response time: 3.341 sec
Progress: 3%, response time: 4.343 sec
Progress: 4%, response time: 5.346 sec
Progress: 9%, response time: 6.35 sec
Progress: 11%, response time: 7.352 sec
Progress: 16%, response time: 8.354 sec
Progress: 18%, response time: 9.362 sec
Progress: 24%, response time: 10.364 sec
Progress: 27%, response time: 11.366 sec
Progress: 29%, response time: 12.368 sec
Progress: 32%, response time: 13.37 sec
Progress: 37%, response time: 14.373 sec
Progress: 40%, response time: 15.377 sec
Progress: 42%, response time: 16.379 sec
Progress: 42%, response time: 17.382 sec
Progress: 43%, response time: 18.384 sec
Progress: 43%, response time: 19.386 sec
Progress: 45%, response time: 20.388 sec
Progress: 45%, response time: 21.391 sec
Progress: 46%, response time: 22.393 sec
Progress: 46%, response time: 23.395 sec
Progress: 48%, response time: 24.398 sec
Progress: 48%, response time: 25.401 sec
Progress: 50%, response time: 26.403 sec
Progress: 100%, response time: 26.95 sec
?count
-------------------------------
4487999
(1 rows, 26.95 sec, 8 B selected)
default> select count(distinct auid) from test_pl_00_0;
Progress: 0%, response time: 0.88 sec
Progress: 0%, response time: 0.881 sec
Progress: 0%, response time: 1.283 sec
Progress: 0%, response time: 2.086 sec
Progress: 0%, response time: 3.088 sec
Progress: 0%, response time: 4.09 sec
Progress: 25%, response time: 5.092 sec
Progress: 33%, response time: 6.094 sec
Progress: 50%, response time: 7.096 sec
Progress: 50%, response time: 8.098 sec
Progress: 50%, response time: 9.099 sec
Progress: 66%, response time: 10.101 sec
Progress: 66%, response time: 11.103 sec
Progress: 83%, response time: 12.105 sec
Progress: 100%, response time: 12.268 sec
?count
-------------------------------
1222356
(1 rows, 12.268 sec, 8 B selected)

On Sat, Jan 17, 2015 at 11:00 PM, Jinho Kim <jh...@apache.org> wrote:

>  Thank you for your sharing
>
> Can you enable the dfs.datanode.hdfs-blocks-metadata.enabled in
> hdfs-site.xml ?
> If you enable the block-metadata, tajo-cluster can use the volume load
> balancing. You should restart the datanode and tajo cluster. I will
> investigate performance of count-distinct operator. and You can change to
> “select count(a.auid) from ( select auid from test_pl_00_0 ) a”
>
>
> -Jinho
> Best regards
>
> 2015-01-16 18:05 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>
> > default> select count(*) from test_pl_00_0;
> > Progress: 0%, response time: 0.718 sec
> > Progress: 0%, response time: 0.72 sec
> > Progress: 0%, response time: 1.121 sec
> > Progress: 12%, response time: 1.923 sec
> > Progress: 28%, response time: 2.925 sec
> > Progress: 41%, response time: 3.927 sec
> > Progress: 50%, response time: 4.931 sec
> > Progress: 100%, response time: 5.323 sec
> > 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew:
> > 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K), 0.0080700
> > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew:
> > 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K), 0.0068130
> > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew:
> > 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K), 0.0092430
> > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew:
> > 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K), 0.0068160
> > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > ?count
> > -------------------------------
> > 4487999
> > (1 rows, 5.323 sec, 8 B selected)
> >
> > On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <az...@gmail.com> wrote:
> >
> > > Hi,
> > > There is no big improvement, sometimes more slower than before. I also
> > try
> > > to increase worker's heap size and parallel, nothing improve.
> > >
> > > default> select count(distinct auid) from test_pl_00_0;
> > > Progress: 0%, response time: 0.963 sec
> > > Progress: 0%, response time: 0.964 sec
> > > Progress: 0%, response time: 1.366 sec
> > > Progress: 0%, response time: 2.168 sec
> > > Progress: 0%, response time: 3.17 sec
> > > Progress: 0%, response time: 4.172 sec
> > > Progress: 16%, response time: 5.174 sec
> > > Progress: 16%, response time: 6.176 sec
> > > Progress: 16%, response time: 7.178 sec
> > > Progress: 33%, response time: 8.18 sec
> > > Progress: 50%, response time: 9.181 sec
> > > Progress: 50%, response time: 10.183 sec
> > > Progress: 50%, response time: 11.185 sec
> > > Progress: 50%, response time: 12.187 sec
> > > Progress: 66%, response time: 13.189 sec
> > > Progress: 66%, response time: 14.19 sec
> > > Progress: 100%, response time: 15.003 sec
> > > 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800: [ParNew:
> > > 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K),
> 0.0105720
> > > secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> > > 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800: [ParNew:
> > > 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K),
> 0.0086940
> > > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > > 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800: [ParNew:
> > > 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K),
> 0.0123210
> > > secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> > > 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800: [ParNew:
> > > 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K),
> 0.0071470
> > > secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> > > ?count
> > > -------------------------------
> > > 1222356
> > > (1 rows, 15.003 sec, 8 B selected)
> > >
> > >
> > > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com> wrote:
> > >
> > >> Thanks Kim, I'll try and post back.
> > >>
> > >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org> wrote:
> > >>
> > >>> Thanks Azuryy Yu
> > >>>
> > >>> Your parallel running tasks of tajo-worker is 10 but heap memory is
> > 3GB.
> > >>> It
> > >>> cause a long JVM pause
> > >>> I recommend following :
> > >>>
> > >>> tajo-env.sh
> > >>> TAJO_WORKER_HEAPSIZE=3000 or more
> > >>>
> > >>> tajo-site.xml
> > >>> <!--  worker  -->
> > >>> <property>
> > >>>   <name>tajo.worker.resource.memory-mb</name>
> > >>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
> > >>> </property>
> > >>> <property>
> > >>>   <name>tajo.task.memory-slot-mb.default</name>
> > >>>   <value>1000</value> <!--  default 512 -->
> > >>> </property>
> > >>> <property>
> > >>>    <name>tajo.worker.resource.dfs-dir-aware</name>
> > >>>    <value>true</value>
> > >>> </property>
> > >>> <!--  end  -->
> > >>>
> >
> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
> > >>>
> > >>> -Jinho
> > >>> Best regards
> > >>>
> > >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> > >>>
> > >>> > Thanks Kim.
> > >>> >
> > >>> > The following is my tajo-env and tajo-site
> > >>> >
> > >>> > *tajo-env.sh:*
> > >>> > export HADOOP_HOME=/usr/local/hadoop
> > >>> > export JAVA_HOME=/usr/local/java
> > >>> > _TAJO_OPTS="-server -verbose:gc
> > >>> >   -XX:+PrintGCDateStamps
> > >>> >   -XX:+PrintGCDetails
> > >>> >   -XX:+UseGCLogFileRotation
> > >>> >   -XX:NumberOfGCLogFiles=9
> > >>> >   -XX:GCLogFileSize=256m
> > >>> >   -XX:+DisableExplicitGC
> > >>> >   -XX:+UseCompressedOops
> > >>> >   -XX:SoftRefLRUPolicyMSPerMB=0
> > >>> >   -XX:+UseFastAccessorMethods
> > >>> >   -XX:+UseParNewGC
> > >>> >   -XX:+UseConcMarkSweepGC
> > >>> >   -XX:+CMSParallelRemarkEnabled
> > >>> >   -XX:CMSInitiatingOccupancyFraction=70
> > >>> >   -XX:+UseCMSCompactAtFullCollection
> > >>> >   -XX:CMSFullGCsBeforeCompaction=0
> > >>> >   -XX:+CMSClassUnloadingEnabled
> > >>> >   -XX:CMSMaxAbortablePrecleanTime=300
> > >>> >   -XX:+CMSScavengeBeforeRemark
> > >>> >   -XX:PermSize=160m
> > >>> >   -XX:GCTimeRatio=19
> > >>> >   -XX:SurvivorRatio=2
> > >>> >   -XX:MaxTenuringThreshold=60"
> > >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> > >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > >>> > export TAJO_OPTS=$_TAJO_OPTS
> > >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> > >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> > >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> > >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> > >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> > >>> > export TAJO_WORKER_STANDBY_MODE=true
> > >>> >
> > >>> > *tajo-site.xml:*
> > >>> >
> > >>> > <configuration>
> > >>> >   <property>
> > >>> >     <name>tajo.rootdir</name>
> > >>> >     <value>hdfs://test-cluster/tajo</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.master.umbilical-rpc.address</name>
> > >>> >     <value>10-0-86-51:26001</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.master.client-rpc.address</name>
> > >>> >     <value>10-0-86-51:26002</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.resource-tracker.rpc.address</name>
> > >>> >     <value>10-0-86-51:26003</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.catalog.client-rpc.address</name>
> > >>> >     <value>10-0-86-51:26005</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.worker.tmpdir.locations</name>
> > >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> > >>> >   </property>
> > >>> >   <!--  worker  -->
> > >>> >   <property>
> > >>> >
>  <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> > >>> >     <value>4</value>
> > >>> >   </property>
> > >>> >  <property>
> > >>> >    <name>tajo.worker.resource.memory-mb</name>
> > >>> >    <value>5120</value>
> > >>> >  </property>
> > >>> >   <property>
> > >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> > >>> >     <value>true</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.worker.resource.dedicated</name>
> > >>> >     <value>true</value>
> > >>> >   </property>
> > >>> >   <property>
> > >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> > >>> >     <value>0.6</value>
> > >>> >   </property>
> > >>> > </configuration>
> > >>> >
> > >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org>
> wrote:
> > >>> >
> > >>> > > Hello Azuyy yu
> > >>> > >
> > >>> > > I left some comments.
> > >>> > >
> > >>> > > -Jinho
> > >>> > > Best regards
> > >>> > >
> > >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> > >>> > >
> > >>> > > > Hi,
> > >>> > > >
> > >>> > > > I tested Tajo before half a year, then not focus on Tajo
> because
> > >>> some
> > >>> > > other
> > >>> > > > works.
> > >>> > > >
> > >>> > > > then I setup a small dev Tajo cluster this week.(six nodes, VM)
> > >>> based
> > >>> > on
> > >>> > > > Hadoop-2.6.0.
> > >>> > > >
> > >>> > > > so my questions is:
> > >>> > > >
> > >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
> > >>> > scheduler
> > >>> > > > to manage  job resources. but now I found it doesn't rely on
> > Yarn,
> > >>> > > because
> > >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his own
> > job
> > >>> > > > sheduler ?
> > >>> > > >
> > >>> > > >
> > >>> > > Now, tajo does using own task scheduler. and  You can start tajo
> > >>> without
> > >>> > > Yarn daemons
> > >>> > > Please refer to
> > http://tajo.apache.org/docs/0.9.0/configuration.html
> > >>> > >
> > >>> > >
> > >>> > > >
> > >>> > > > 2) Does that we need to put the file replications on every
> nodes
> > on
> > >>> > Tajo
> > >>> > > > cluster?
> > >>> > > >
> > >>> > >
> > >>> > > No, tajo does not need more replication.  if you set more
> > >>> replication,
> > >>> > data
> > >>> > > locality can be increased
> > >>> > >
> > >>> > > such as I have a six nodes Tajo cluster, then should I set HDFS
> > block
> > >>> > > > replication to six? because:
> > >>> > > >
> > >>> > > > I noticed when I run Tajo query, some nodes are busy, but some
> is
> > >>> free.
> > >>> > > > because the file's blocks are only located on these nodes. non
> > >>> others.
> > >>> > > >
> > >>> > > >
> > >>> > > In my opinion, you need to run balancer
> > >>> > >
> > >>> > >
> > >>> >
> > >>>
> >
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> > >>> > >
> > >>> > >
> > >>> > > 3)the test data set is 4 million rows. nearly several GB. but
> it's
> > >>> very
> > >>> > > > slow when I runing: select count(distinct ID) from ****;
> > >>> > > > Any possible problems here?
> > >>> > > >
> > >>> > >
> > >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> > >>> > >
> > >>> > >
> > >>> > > >
> > >>> > > >
> > >>> > > > Thanks
> > >>> > > >
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Jinho Kim <jh...@apache.org>.
Thank you for sharing your configuration.

Can you enable dfs.datanode.hdfs-blocks-metadata.enabled in
hdfs-site.xml?
If you enable block metadata, the Tajo cluster can use volume-aware load
balancing. You should restart the DataNodes and the Tajo cluster. I will
investigate the performance of the count-distinct operator. In the
meantime, you can change the query to
“select count(a.auid) from ( select auid from test_pl_00_0 ) a”
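
A minimal hdfs-site.xml fragment for this would look like the following
(property name as documented by Hadoop; true simply turns the feature on,
and it must be set on every DataNode before restarting):

<property>
  <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
  <value>true</value>
</property>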


-Jinho
Best regards

2015-01-16 18:05 GMT+09:00 Azuryy Yu <az...@gmail.com>:

> default> select count(*) from test_pl_00_0;
> Progress: 0%, response time: 0.718 sec
> Progress: 0%, response time: 0.72 sec
> Progress: 0%, response time: 1.121 sec
> Progress: 12%, response time: 1.923 sec
> Progress: 28%, response time: 2.925 sec
> Progress: 41%, response time: 3.927 sec
> Progress: 50%, response time: 4.931 sec
> Progress: 100%, response time: 5.323 sec
> 2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew:
> 26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K), 0.0080700
> secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> 2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew:
> 27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K), 0.0068130
> secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> 2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew:
> 28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K), 0.0092430
> secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> 2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew:
> 26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K), 0.0068160
> secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> ?count
> -------------------------------
> 4487999
> (1 rows, 5.323 sec, 8 B selected)
>
> On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <az...@gmail.com> wrote:
>
> > Hi,
> > There is no big improvement, sometimes more slower than before. I also
> try
> > to increase worker's heap size and parallel, nothing improve.
> >
> > default> select count(distinct auid) from test_pl_00_0;
> > Progress: 0%, response time: 0.963 sec
> > Progress: 0%, response time: 0.964 sec
> > Progress: 0%, response time: 1.366 sec
> > Progress: 0%, response time: 2.168 sec
> > Progress: 0%, response time: 3.17 sec
> > Progress: 0%, response time: 4.172 sec
> > Progress: 16%, response time: 5.174 sec
> > Progress: 16%, response time: 6.176 sec
> > Progress: 16%, response time: 7.178 sec
> > Progress: 33%, response time: 8.18 sec
> > Progress: 50%, response time: 9.181 sec
> > Progress: 50%, response time: 10.183 sec
> > Progress: 50%, response time: 11.185 sec
> > Progress: 50%, response time: 12.187 sec
> > Progress: 66%, response time: 13.189 sec
> > Progress: 66%, response time: 14.19 sec
> > Progress: 100%, response time: 15.003 sec
> > 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800: [ParNew:
> > 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K), 0.0105720
> > secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> > 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800: [ParNew:
> > 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K), 0.0086940
> > secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> > 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800: [ParNew:
> > 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K), 0.0123210
> > secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> > 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800: [ParNew:
> > 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K), 0.0071470
> > secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> > ?count
> > -------------------------------
> > 1222356
> > (1 rows, 15.003 sec, 8 B selected)
> >
> >
> > On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com> wrote:
> >
> >> Thanks Kim, I'll try and post back.
> >>
> >> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org> wrote:
> >>
> >>> Thanks Azuryy Yu
> >>>
> >>> Your parallel running tasks of tajo-worker is 10 but heap memory is
> 3GB.
> >>> It
> >>> cause a long JVM pause
> >>> I recommend following :
> >>>
> >>> tajo-env.sh
> >>> TAJO_WORKER_HEAPSIZE=3000 or more
> >>>
> >>> tajo-site.xml
> >>> <!--  worker  -->
> >>> <property>
> >>>   <name>tajo.worker.resource.memory-mb</name>
> >>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
> >>> </property>
> >>> <property>
> >>>   <name>tajo.task.memory-slot-mb.default</name>
> >>>   <value>1000</value> <!--  default 512 -->
> >>> </property>
> >>> <property>
> >>>    <name>tajo.worker.resource.dfs-dir-aware</name>
> >>>    <value>true</value>
> >>> </property>
> >>> <!--  end  -->
> >>>
> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
> >>>
> >>> -Jinho
> >>> Best regards
> >>>
> >>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >>>
> >>> > Thanks Kim.
> >>> >
> >>> > The following is my tajo-env and tajo-site
> >>> >
> >>> > *tajo-env.sh:*
> >>> > export HADOOP_HOME=/usr/local/hadoop
> >>> > export JAVA_HOME=/usr/local/java
> >>> > _TAJO_OPTS="-server -verbose:gc
> >>> >   -XX:+PrintGCDateStamps
> >>> >   -XX:+PrintGCDetails
> >>> >   -XX:+UseGCLogFileRotation
> >>> >   -XX:NumberOfGCLogFiles=9
> >>> >   -XX:GCLogFileSize=256m
> >>> >   -XX:+DisableExplicitGC
> >>> >   -XX:+UseCompressedOops
> >>> >   -XX:SoftRefLRUPolicyMSPerMB=0
> >>> >   -XX:+UseFastAccessorMethods
> >>> >   -XX:+UseParNewGC
> >>> >   -XX:+UseConcMarkSweepGC
> >>> >   -XX:+CMSParallelRemarkEnabled
> >>> >   -XX:CMSInitiatingOccupancyFraction=70
> >>> >   -XX:+UseCMSCompactAtFullCollection
> >>> >   -XX:CMSFullGCsBeforeCompaction=0
> >>> >   -XX:+CMSClassUnloadingEnabled
> >>> >   -XX:CMSMaxAbortablePrecleanTime=300
> >>> >   -XX:+CMSScavengeBeforeRemark
> >>> >   -XX:PermSize=160m
> >>> >   -XX:GCTimeRatio=19
> >>> >   -XX:SurvivorRatio=2
> >>> >   -XX:MaxTenuringThreshold=60"
> >>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> >>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> >>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> >>> > export TAJO_OPTS=$_TAJO_OPTS
> >>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> >>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> >>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> >>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> >>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> >>> > export TAJO_WORKER_STANDBY_MODE=true
> >>> >
> >>> > *tajo-site.xml:*
> >>> >
> >>> > <configuration>
> >>> >   <property>
> >>> >     <name>tajo.rootdir</name>
> >>> >     <value>hdfs://test-cluster/tajo</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.master.umbilical-rpc.address</name>
> >>> >     <value>10-0-86-51:26001</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.master.client-rpc.address</name>
> >>> >     <value>10-0-86-51:26002</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.resource-tracker.rpc.address</name>
> >>> >     <value>10-0-86-51:26003</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.catalog.client-rpc.address</name>
> >>> >     <value>10-0-86-51:26005</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.worker.tmpdir.locations</name>
> >>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> >>> >   </property>
> >>> >   <!--  worker  -->
> >>> >   <property>
> >>> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> >>> >     <value>4</value>
> >>> >   </property>
> >>> >  <property>
> >>> >    <name>tajo.worker.resource.memory-mb</name>
> >>> >    <value>5120</value>
> >>> >  </property>
> >>> >   <property>
> >>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> >>> >     <value>true</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.worker.resource.dedicated</name>
> >>> >     <value>true</value>
> >>> >   </property>
> >>> >   <property>
> >>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> >>> >     <value>0.6</value>
> >>> >   </property>
> >>> > </configuration>
> >>> >
> >>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:
> >>> >
> >>> > > Hello Azuyy yu
> >>> > >
> >>> > > I left some comments.
> >>> > >
> >>> > > -Jinho
> >>> > > Best regards
> >>> > >
> >>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >>> > >
> >>> > > > Hi,
> >>> > > >
> >>> > > > I tested Tajo before half a year, then not focus on Tajo because
> >>> some
> >>> > > other
> >>> > > > works.
> >>> > > >
> >>> > > > then I setup a small dev Tajo cluster this week.(six nodes, VM)
> >>> based
> >>> > on
> >>> > > > Hadoop-2.6.0.
> >>> > > >
> >>> > > > so my questions is:
> >>> > > >
> >>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
> >>> > scheduler
> >>> > > > to manage  job resources. but now I found it doesn't rely on
> Yarn,
> >>> > > because
> >>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his own
> job
> >>> > > > sheduler ?
> >>> > > >
> >>> > > >
> >>> > > Now, tajo does using own task scheduler. and  You can start tajo
> >>> without
> >>> > > Yarn daemons
> >>> > > Please refer to
> http://tajo.apache.org/docs/0.9.0/configuration.html
> >>> > >
> >>> > >
> >>> > > >
> >>> > > > 2) Does that we need to put the file replications on every nodes
> on
> >>> > Tajo
> >>> > > > cluster?
> >>> > > >
> >>> > >
> >>> > > No, tajo does not need more replication.  if you set more
> >>> replication,
> >>> > data
> >>> > > locality can be increased
> >>> > >
> >>> > > such as I have a six nodes Tajo cluster, then should I set HDFS
> block
> >>> > > > replication to six? because:
> >>> > > >
> >>> > > > I noticed when I run Tajo query, some nodes are busy, but some is
> >>> free.
> >>> > > > because the file's blocks are only located on these nodes. non
> >>> others.
> >>> > > >
> >>> > > >
> >>> > > In my opinion, you need to run balancer
> >>> > >
> >>> > >
> >>> >
> >>>
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> >>> > >
> >>> > >
> >>> > > 3)the test data set is 4 million rows. nearly several GB. but it's
> >>> very
> >>> > > > slow when I runing: select count(distinct ID) from ****;
> >>> > > > Any possible problems here?
> >>> > > >
> >>> > >
> >>> > > Could you share tajo-env.sh, tajo-site.xml ?
> >>> > >
> >>> > >
> >>> > > >
> >>> > > >
> >>> > > > Thanks
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
default> select count(*) from test_pl_00_0;
Progress: 0%, response time: 0.718 sec
Progress: 0%, response time: 0.72 sec
Progress: 0%, response time: 1.121 sec
Progress: 12%, response time: 1.923 sec
Progress: 28%, response time: 2.925 sec
Progress: 41%, response time: 3.927 sec
Progress: 50%, response time: 4.931 sec
Progress: 100%, response time: 5.323 sec
2015-01-16T17:04:41.116+0800: [GC2015-01-16T17:04:41.116+0800: [ParNew:
26543K->6211K(31488K), 0.0079770 secs] 26543K->6211K(115456K), 0.0080700
secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-01-16T17:04:41.303+0800: [GC2015-01-16T17:04:41.303+0800: [ParNew:
27203K->7185K(31488K), 0.0066950 secs] 27203K->7185K(115456K), 0.0068130
secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-01-16T17:04:41.504+0800: [GC2015-01-16T17:04:41.504+0800: [ParNew:
28177K->5597K(31488K), 0.0091630 secs] 28177K->6523K(115456K), 0.0092430
secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-01-16T17:04:41.778+0800: [GC2015-01-16T17:04:41.778+0800: [ParNew:
26589K->6837K(31488K), 0.0067280 secs] 27515K->7764K(115456K), 0.0068160
secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
?count
-------------------------------
4487999
(1 rows, 5.323 sec, 8 B selected)

On Fri, Jan 16, 2015 at 5:03 PM, Azuryy Yu <az...@gmail.com> wrote:

> Hi,
> There is no big improvement, sometimes more slower than before. I also try
> to increase worker's heap size and parallel, nothing improve.
>
> default> select count(distinct auid) from test_pl_00_0;
> Progress: 0%, response time: 0.963 sec
> Progress: 0%, response time: 0.964 sec
> Progress: 0%, response time: 1.366 sec
> Progress: 0%, response time: 2.168 sec
> Progress: 0%, response time: 3.17 sec
> Progress: 0%, response time: 4.172 sec
> Progress: 16%, response time: 5.174 sec
> Progress: 16%, response time: 6.176 sec
> Progress: 16%, response time: 7.178 sec
> Progress: 33%, response time: 8.18 sec
> Progress: 50%, response time: 9.181 sec
> Progress: 50%, response time: 10.183 sec
> Progress: 50%, response time: 11.185 sec
> Progress: 50%, response time: 12.187 sec
> Progress: 66%, response time: 13.189 sec
> Progress: 66%, response time: 14.19 sec
> Progress: 100%, response time: 15.003 sec
> 2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800: [ParNew:
> 26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K), 0.0105720
> secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
> 2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800: [ParNew:
> 27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K), 0.0086940
> secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
> 2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800: [ParNew:
> 27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K), 0.0123210
> secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
> 2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800: [ParNew:
> 26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K), 0.0071470
> secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
> ?count
> -------------------------------
> 1222356
> (1 rows, 15.003 sec, 8 B selected)
>
>
> On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com> wrote:
>
>> Thanks Kim, I'll try and post back.
>>
>> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org> wrote:
>>
>>> Thanks Azuryy Yu
>>>
>>> Your parallel running tasks of tajo-worker is 10 but heap memory is 3GB.
>>> It
>>> cause a long JVM pause
>>> I recommend following :
>>>
>>> tajo-env.sh
>>> TAJO_WORKER_HEAPSIZE=3000 or more
>>>
>>> tajo-site.xml
>>> <!--  worker  -->
>>> <property>
>>>   <name>tajo.worker.resource.memory-mb</name>
>>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
>>> </property>
>>> <property>
>>>   <name>tajo.task.memory-slot-mb.default</name>
>>>   <value>1000</value> <!--  default 512 -->
>>> </property>
>>> <property>
>>>    <name>tajo.worker.resource.dfs-dir-aware</name>
>>>    <value>true</value>
>>> </property>
>>> <!--  end  -->
>>> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
>>>
>>> -Jinho
>>> Best regards
>>>
>>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>>>
>>> > Thanks Kim.
>>> >
>>> > The following is my tajo-env and tajo-site
>>> >
>>> > *tajo-env.sh:*
>>> > export HADOOP_HOME=/usr/local/hadoop
>>> > export JAVA_HOME=/usr/local/java
>>> > _TAJO_OPTS="-server -verbose:gc
>>> >   -XX:+PrintGCDateStamps
>>> >   -XX:+PrintGCDetails
>>> >   -XX:+UseGCLogFileRotation
>>> >   -XX:NumberOfGCLogFiles=9
>>> >   -XX:GCLogFileSize=256m
>>> >   -XX:+DisableExplicitGC
>>> >   -XX:+UseCompressedOops
>>> >   -XX:SoftRefLRUPolicyMSPerMB=0
>>> >   -XX:+UseFastAccessorMethods
>>> >   -XX:+UseParNewGC
>>> >   -XX:+UseConcMarkSweepGC
>>> >   -XX:+CMSParallelRemarkEnabled
>>> >   -XX:CMSInitiatingOccupancyFraction=70
>>> >   -XX:+UseCMSCompactAtFullCollection
>>> >   -XX:CMSFullGCsBeforeCompaction=0
>>> >   -XX:+CMSClassUnloadingEnabled
>>> >   -XX:CMSMaxAbortablePrecleanTime=300
>>> >   -XX:+CMSScavengeBeforeRemark
>>> >   -XX:PermSize=160m
>>> >   -XX:GCTimeRatio=19
>>> >   -XX:SurvivorRatio=2
>>> >   -XX:MaxTenuringThreshold=60"
>>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
>>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>>> > export TAJO_OPTS=$_TAJO_OPTS
>>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
>>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
>>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
>>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
>>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
>>> > export TAJO_WORKER_STANDBY_MODE=true
>>> >
>>> > *tajo-site.xml:*
>>> >
>>> > <configuration>
>>> >   <property>
>>> >     <name>tajo.rootdir</name>
>>> >     <value>hdfs://test-cluster/tajo</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.master.umbilical-rpc.address</name>
>>> >     <value>10-0-86-51:26001</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.master.client-rpc.address</name>
>>> >     <value>10-0-86-51:26002</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.resource-tracker.rpc.address</name>
>>> >     <value>10-0-86-51:26003</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.catalog.client-rpc.address</name>
>>> >     <value>10-0-86-51:26005</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.worker.tmpdir.locations</name>
>>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
>>> >   </property>
>>> >   <!--  worker  -->
>>> >   <property>
>>> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
>>> >     <value>4</value>
>>> >   </property>
>>> >  <property>
>>> >    <name>tajo.worker.resource.memory-mb</name>
>>> >    <value>5120</value>
>>> >  </property>
>>> >   <property>
>>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
>>> >     <value>true</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.worker.resource.dedicated</name>
>>> >     <value>true</value>
>>> >   </property>
>>> >   <property>
>>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
>>> >     <value>0.6</value>
>>> >   </property>
>>> > </configuration>
>>> >
>>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:
>>> >
>>> > > Hello Azuyy yu
>>> > >
>>> > > I left some comments.
>>> > >
>>> > > -Jinho
>>> > > Best regards
>>> > >
>>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > I tested Tajo before half a year, then not focus on Tajo because
>>> some
>>> > > other
>>> > > > works.
>>> > > >
>>> > > > then I setup a small dev Tajo cluster this week.(six nodes, VM)
>>> based
>>> > on
>>> > > > Hadoop-2.6.0.
>>> > > >
>>> > > > so my questions is:
>>> > > >
>>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
>>> > scheduler
>>> > > > to manage  job resources. but now I found it doesn't rely on Yarn,
>>> > > because
>>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his own job
>>> > > > sheduler ?
>>> > > >
>>> > > >
>>> > > Now, tajo does using own task scheduler. and  You can start tajo
>>> without
>>> > > Yarn daemons
>>> > > Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
>>> > >
>>> > >
>>> > > >
>>> > > > 2) Does that we need to put the file replications on every nodes on
>>> > Tajo
>>> > > > cluster?
>>> > > >
>>> > >
>>> > > No, tajo does not need more replication.  if you set more
>>> replication,
>>> > data
>>> > > locality can be increased
>>> > >
>>> > > such as I have a six nodes Tajo cluster, then should I set HDFS block
>>> > > > replication to six? because:
>>> > > >
>>> > > > I noticed when I run Tajo query, some nodes are busy, but some is
>>> free.
>>> > > > because the file's blocks are only located on these nodes. non
>>> others.
>>> > > >
>>> > > >
>>> > > In my opinion, you need to run balancer
>>> > >
>>> > >
>>> >
>>> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
>>> > >
>>> > >
>>> > > 3)the test data set is 4 million rows. nearly several GB. but it's
>>> very
>>> > > > slow when I runing: select count(distinct ID) from ****;
>>> > > > Any possible problems here?
>>> > > >
>>> > >
>>> > > Could you share tajo-env.sh, tajo-site.xml ?
>>> > >
>>> > >
>>> > > >
>>> > > >
>>> > > > Thanks
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Hi,
There is no big improvement; sometimes it is even slower than before. I also
tried increasing the worker's heap size and parallelism, but nothing improved.

default> select count(distinct auid) from test_pl_00_0;
Progress: 0%, response time: 0.963 sec
Progress: 0%, response time: 0.964 sec
Progress: 0%, response time: 1.366 sec
Progress: 0%, response time: 2.168 sec
Progress: 0%, response time: 3.17 sec
Progress: 0%, response time: 4.172 sec
Progress: 16%, response time: 5.174 sec
Progress: 16%, response time: 6.176 sec
Progress: 16%, response time: 7.178 sec
Progress: 33%, response time: 8.18 sec
Progress: 50%, response time: 9.181 sec
Progress: 50%, response time: 10.183 sec
Progress: 50%, response time: 11.185 sec
Progress: 50%, response time: 12.187 sec
Progress: 66%, response time: 13.189 sec
Progress: 66%, response time: 14.19 sec
Progress: 100%, response time: 15.003 sec
2015-01-16T17:00:56.410+0800: [GC2015-01-16T17:00:56.410+0800: [ParNew:
26473K->6582K(31488K), 0.0105030 secs] 26473K->6582K(115456K), 0.0105720
secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
2015-01-16T17:00:56.593+0800: [GC2015-01-16T17:00:56.593+0800: [ParNew:
27574K->6469K(31488K), 0.0086300 secs] 27574K->6469K(115456K), 0.0086940
secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
2015-01-16T17:00:56.800+0800: [GC2015-01-16T17:00:56.800+0800: [ParNew:
27461K->5664K(31488K), 0.0122560 secs] 27461K->6591K(115456K), 0.0123210
secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
2015-01-16T17:00:57.065+0800: [GC2015-01-16T17:00:57.065+0800: [ParNew:
26656K->6906K(31488K), 0.0070520 secs] 27583K->7833K(115456K), 0.0071470
secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
?count
-------------------------------
1222356
(1 rows, 15.003 sec, 8 B selected)


On Fri, Jan 16, 2015 at 4:09 PM, Azuryy Yu <az...@gmail.com> wrote:

> Thanks Kim, I'll try and post back.
>
> On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org> wrote:
>
>> Thanks Azuryy Yu
>>
>> Your parallel running tasks of tajo-worker is 10 but heap memory is 3GB.
>> It
>> cause a long JVM pause
>> I recommend following :
>>
>> tajo-env.sh
>> TAJO_WORKER_HEAPSIZE=3000 or more
>>
>> tajo-site.xml
>> <!--  worker  -->
>> <property>
>>   <name>tajo.worker.resource.memory-mb</name>
>>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
>> </property>
>> <property>
>>   <name>tajo.task.memory-slot-mb.default</name>
>>   <value>1000</value> <!--  default 512 -->
>> </property>
>> <property>
>>    <name>tajo.worker.resource.dfs-dir-aware</name>
>>    <value>true</value>
>> </property>
>> <!--  end  -->
>> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
>>
>> -Jinho
>> Best regards
>>
>> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>>
>> > Thanks Kim.
>> >
>> > The following is my tajo-env and tajo-site
>> >
>> > *tajo-env.sh:*
>> > export HADOOP_HOME=/usr/local/hadoop
>> > export JAVA_HOME=/usr/local/java
>> > _TAJO_OPTS="-server -verbose:gc
>> >   -XX:+PrintGCDateStamps
>> >   -XX:+PrintGCDetails
>> >   -XX:+UseGCLogFileRotation
>> >   -XX:NumberOfGCLogFiles=9
>> >   -XX:GCLogFileSize=256m
>> >   -XX:+DisableExplicitGC
>> >   -XX:+UseCompressedOops
>> >   -XX:SoftRefLRUPolicyMSPerMB=0
>> >   -XX:+UseFastAccessorMethods
>> >   -XX:+UseParNewGC
>> >   -XX:+UseConcMarkSweepGC
>> >   -XX:+CMSParallelRemarkEnabled
>> >   -XX:CMSInitiatingOccupancyFraction=70
>> >   -XX:+UseCMSCompactAtFullCollection
>> >   -XX:CMSFullGCsBeforeCompaction=0
>> >   -XX:+CMSClassUnloadingEnabled
>> >   -XX:CMSMaxAbortablePrecleanTime=300
>> >   -XX:+CMSScavengeBeforeRemark
>> >   -XX:PermSize=160m
>> >   -XX:GCTimeRatio=19
>> >   -XX:SurvivorRatio=2
>> >   -XX:MaxTenuringThreshold=60"
>> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
>> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
>> > export TAJO_OPTS=$_TAJO_OPTS
>> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
>> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
>> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
>> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
>> > export TAJO_PID_DIR=${TAJO_HOME}/pids
>> > export TAJO_WORKER_STANDBY_MODE=true
>> >
>> > *tajo-site.xml:*
>> >
>> > <configuration>
>> >   <property>
>> >     <name>tajo.rootdir</name>
>> >     <value>hdfs://test-cluster/tajo</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.master.umbilical-rpc.address</name>
>> >     <value>10-0-86-51:26001</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.master.client-rpc.address</name>
>> >     <value>10-0-86-51:26002</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.resource-tracker.rpc.address</name>
>> >     <value>10-0-86-51:26003</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.catalog.client-rpc.address</name>
>> >     <value>10-0-86-51:26005</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.worker.tmpdir.locations</name>
>> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
>> >   </property>
>> >   <!--  worker  -->
>> >   <property>
>> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
>> >     <value>4</value>
>> >   </property>
>> >  <property>
>> >    <name>tajo.worker.resource.memory-mb</name>
>> >    <value>5120</value>
>> >  </property>
>> >   <property>
>> >     <name>tajo.worker.resource.dfs-dir-aware</name>
>> >     <value>true</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.worker.resource.dedicated</name>
>> >     <value>true</value>
>> >   </property>
>> >   <property>
>> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
>> >     <value>0.6</value>
>> >   </property>
>> > </configuration>
>> >
>> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:
>> >
>> > > Hello Azuyy yu
>> > >
>> > > I left some comments.
>> > >
>> > > -Jinho
>> > > Best regards
>> > >
>> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>> > >
>> > > > Hi,
>> > > >
>> > > > I tested Tajo before half a year, then not focus on Tajo because
>> some
>> > > other
>> > > > works.
>> > > >
>> > > > then I setup a small dev Tajo cluster this week.(six nodes, VM)
>> based
>> > on
>> > > > Hadoop-2.6.0.
>> > > >
>> > > > so my questions is:
>> > > >
>> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
>> > scheduler
>> > > > to manage  job resources. but now I found it doesn't rely on Yarn,
>> > > because
>> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his own job
>> > > > sheduler ?
>> > > >
>> > > >
>> > > Now, tajo does using own task scheduler. and  You can start tajo
>> without
>> > > Yarn daemons
>> > > Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
>> > >
>> > >
>> > > >
>> > > > 2) Does that we need to put the file replications on every nodes on
>> > Tajo
>> > > > cluster?
>> > > >
>> > >
>> > > No, tajo does not need more replication.  if you set more replication,
>> > data
>> > > locality can be increased
>> > >
>> > > such as I have a six nodes Tajo cluster, then should I set HDFS block
>> > > > replication to six? because:
>> > > >
>> > > > I noticed when I run Tajo query, some nodes are busy, but some is
>> free.
>> > > > because the file's blocks are only located on these nodes. non
>> others.
>> > > >
>> > > >
>> > > In my opinion, you need to run balancer
>> > >
>> > >
>> >
>> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
>> > >
>> > >
>> > > 3)the test data set is 4 million rows. nearly several GB. but it's
>> very
>> > > > slow when I runing: select count(distinct ID) from ****;
>> > > > Any possible problems here?
>> > > >
>> > >
>> > > Could you share tajo-env.sh, tajo-site.xml ?
>> > >
>> > >
>> > > >
>> > > >
>> > > > Thanks
>> > > >
>> > >
>> >
>>
>
>

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Thanks Kim, I'll try and post back.

On Fri, Jan 16, 2015 at 4:02 PM, Jinho Kim <jh...@apache.org> wrote:

> Thanks Azuryy Yu
>
> Your parallel running tasks of tajo-worker is 10 but heap memory is 3GB. It
> cause a long JVM pause
> I recommend following :
>
> tajo-env.sh
> TAJO_WORKER_HEAPSIZE=3000 or more
>
> tajo-site.xml
> <!--  worker  -->
> <property>
>   <name>tajo.worker.resource.memory-mb</name>
>   <value>3512</value> <!--  3 tasks + 1 qm task  -->
> </property>
> <property>
>   <name>tajo.task.memory-slot-mb.default</name>
>   <value>1000</value> <!--  default 512 -->
> </property>
> <property>
>    <name>tajo.worker.resource.dfs-dir-aware</name>
>    <value>true</value>
> </property>
> <!--  end  -->
> http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
>
> -Jinho
> Best regards
>
> 2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>
> > Thanks Kim.
> >
> > The following is my tajo-env and tajo-site
> >
> > *tajo-env.sh:*
> > export HADOOP_HOME=/usr/local/hadoop
> > export JAVA_HOME=/usr/local/java
> > _TAJO_OPTS="-server -verbose:gc
> >   -XX:+PrintGCDateStamps
> >   -XX:+PrintGCDetails
> >   -XX:+UseGCLogFileRotation
> >   -XX:NumberOfGCLogFiles=9
> >   -XX:GCLogFileSize=256m
> >   -XX:+DisableExplicitGC
> >   -XX:+UseCompressedOops
> >   -XX:SoftRefLRUPolicyMSPerMB=0
> >   -XX:+UseFastAccessorMethods
> >   -XX:+UseParNewGC
> >   -XX:+UseConcMarkSweepGC
> >   -XX:+CMSParallelRemarkEnabled
> >   -XX:CMSInitiatingOccupancyFraction=70
> >   -XX:+UseCMSCompactAtFullCollection
> >   -XX:CMSFullGCsBeforeCompaction=0
> >   -XX:+CMSClassUnloadingEnabled
> >   -XX:CMSMaxAbortablePrecleanTime=300
> >   -XX:+CMSScavengeBeforeRemark
> >   -XX:PermSize=160m
> >   -XX:GCTimeRatio=19
> >   -XX:SurvivorRatio=2
> >   -XX:MaxTenuringThreshold=60"
> > _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> > _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> > export TAJO_OPTS=$_TAJO_OPTS
> > export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> > export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> > export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> > export TAJO_LOG_DIR=${TAJO_HOME}/logs
> > export TAJO_PID_DIR=${TAJO_HOME}/pids
> > export TAJO_WORKER_STANDBY_MODE=true
> >
> > *tajo-site.xml:*
> >
> > <configuration>
> >   <property>
> >     <name>tajo.rootdir</name>
> >     <value>hdfs://test-cluster/tajo</value>
> >   </property>
> >   <property>
> >     <name>tajo.master.umbilical-rpc.address</name>
> >     <value>10-0-86-51:26001</value>
> >   </property>
> >   <property>
> >     <name>tajo.master.client-rpc.address</name>
> >     <value>10-0-86-51:26002</value>
> >   </property>
> >   <property>
> >     <name>tajo.resource-tracker.rpc.address</name>
> >     <value>10-0-86-51:26003</value>
> >   </property>
> >   <property>
> >     <name>tajo.catalog.client-rpc.address</name>
> >     <value>10-0-86-51:26005</value>
> >   </property>
> >   <property>
> >     <name>tajo.worker.tmpdir.locations</name>
> >     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
> >   </property>
> >   <!--  worker  -->
> >   <property>
> >     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
> >     <value>4</value>
> >   </property>
> >  <property>
> >    <name>tajo.worker.resource.memory-mb</name>
> >    <value>5120</value>
> >  </property>
> >   <property>
> >     <name>tajo.worker.resource.dfs-dir-aware</name>
> >     <value>true</value>
> >   </property>
> >   <property>
> >     <name>tajo.worker.resource.dedicated</name>
> >     <value>true</value>
> >   </property>
> >   <property>
> >     <name>tajo.worker.resource.dedicated-memory-ratio</name>
> >     <value>0.6</value>
> >   </property>
> > </configuration>
> >
> > On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:
> >
> > > Hello Azuyy yu
> > >
> > > I left some comments.
> > >
> > > -Jinho
> > > Best regards
> > >
> > > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> > >
> > > > Hi,
> > > >
> > > > I tested Tajo before half a year, then not focus on Tajo because some
> > > other
> > > > works.
> > > >
> > > > then I setup a small dev Tajo cluster this week.(six nodes, VM) based
> > on
> > > > Hadoop-2.6.0.
> > > >
> > > > so my questions is:
> > > >
> > > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
> > scheduler
> > > > to manage  job resources. but now I found it doesn't rely on Yarn,
> > > because
> > > > I only start HDFS daemons, no yarn daemons. so Tajo has his own job
> > > > sheduler ?
> > > >
> > > >
> > > Now, tajo does using own task scheduler. and  You can start tajo
> without
> > > Yarn daemons
> > > Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
> > >
> > >
> > > >
> > > > 2) Does that we need to put the file replications on every nodes on
> > Tajo
> > > > cluster?
> > > >
> > >
> > > No, tajo does not need more replication.  if you set more replication,
> > data
> > > locality can be increased
> > >
> > > such as I have a six nodes Tajo cluster, then should I set HDFS block
> > > > replication to six? because:
> > > >
> > > > I noticed when I run Tajo query, some nodes are busy, but some is
> free.
> > > > because the file's blocks are only located on these nodes. non
> others.
> > > >
> > > >
> > > In my opinion, you need to run balancer
> > >
> > >
> >
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> > >
> > >
> > > 3)the test data set is 4 million rows. nearly several GB. but it's very
> > > > slow when I runing: select count(distinct ID) from ****;
> > > > Any possible problems here?
> > > >
> > >
> > > Could you share tajo-env.sh, tajo-site.xml ?
> > >
> > >
> > > >
> > > >
> > > > Thanks
> > > >
> > >
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Jinho Kim <jh...@apache.org>.
Thanks Azuryy Yu

Your tajo-worker runs 10 tasks in parallel, but its heap memory is only 3 GB.
That can cause long JVM pauses.
I recommend the following:

tajo-env.sh
TAJO_WORKER_HEAPSIZE=3000 or more

tajo-site.xml
<!--  worker  -->
<property>
  <name>tajo.worker.resource.memory-mb</name>
  <value>3512</value> <!--  3 tasks + 1 qm task  -->
</property>
<property>
  <name>tajo.task.memory-slot-mb.default</name>
  <value>1000</value> <!--  default 512 -->
</property>
<property>
   <name>tajo.worker.resource.dfs-dir-aware</name>
   <value>true</value>
</property>
<!--  end  -->
http://tajo.apache.org/docs/0.9.0/configuration/worker_configuration.html
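
As a rough sketch of the slot arithmetic behind these values (assuming a
worker's concurrent tasks come out to roughly tajo.worker.resource.memory-mb
divided by tajo.task.memory-slot-mb.default, with one extra slot reserved for
the query master):

# 3 task slots of 1000 MB each, plus one 512 MB query-master slot
echo $(( 3 * 1000 + 512 ))   # => 3512, the memory-mb value above
# with a 3000 MB worker heap, each of the 3 tasks gets roughly 1 GB
echo $(( 3000 / 3 ))         # => 1000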

-Jinho
Best regards

2015-01-16 16:02 GMT+09:00 Azuryy Yu <az...@gmail.com>:

> Thanks Kim.
>
> The following is my tajo-env and tajo-site
>
> *tajo-env.sh:*
> export HADOOP_HOME=/usr/local/hadoop
> export JAVA_HOME=/usr/local/java
> _TAJO_OPTS="-server -verbose:gc
>   -XX:+PrintGCDateStamps
>   -XX:+PrintGCDetails
>   -XX:+UseGCLogFileRotation
>   -XX:NumberOfGCLogFiles=9
>   -XX:GCLogFileSize=256m
>   -XX:+DisableExplicitGC
>   -XX:+UseCompressedOops
>   -XX:SoftRefLRUPolicyMSPerMB=0
>   -XX:+UseFastAccessorMethods
>   -XX:+UseParNewGC
>   -XX:+UseConcMarkSweepGC
>   -XX:+CMSParallelRemarkEnabled
>   -XX:CMSInitiatingOccupancyFraction=70
>   -XX:+UseCMSCompactAtFullCollection
>   -XX:CMSFullGCsBeforeCompaction=0
>   -XX:+CMSClassUnloadingEnabled
>   -XX:CMSMaxAbortablePrecleanTime=300
>   -XX:+CMSScavengeBeforeRemark
>   -XX:PermSize=160m
>   -XX:GCTimeRatio=19
>   -XX:SurvivorRatio=2
>   -XX:MaxTenuringThreshold=60"
> _TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> _TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
> _TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
> export TAJO_OPTS=$_TAJO_OPTS
> export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
> export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
> export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
> export TAJO_LOG_DIR=${TAJO_HOME}/logs
> export TAJO_PID_DIR=${TAJO_HOME}/pids
> export TAJO_WORKER_STANDBY_MODE=true
>
> *tajo-site.xml:*
>
> <configuration>
>   <property>
>     <name>tajo.rootdir</name>
>     <value>hdfs://test-cluster/tajo</value>
>   </property>
>   <property>
>     <name>tajo.master.umbilical-rpc.address</name>
>     <value>10-0-86-51:26001</value>
>   </property>
>   <property>
>     <name>tajo.master.client-rpc.address</name>
>     <value>10-0-86-51:26002</value>
>   </property>
>   <property>
>     <name>tajo.resource-tracker.rpc.address</name>
>     <value>10-0-86-51:26003</value>
>   </property>
>   <property>
>     <name>tajo.catalog.client-rpc.address</name>
>     <value>10-0-86-51:26005</value>
>   </property>
>   <property>
>     <name>tajo.worker.tmpdir.locations</name>
>     <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
>   </property>
>   <!--  worker  -->
>   <property>
>     <name>tajo.worker.resource.tajo.worker.resource.cpu-cores</name>
>     <value>4</value>
>   </property>
>  <property>
>    <name>tajo.worker.resource.memory-mb</name>
>    <value>5120</value>
>  </property>
>   <property>
>     <name>tajo.worker.resource.dfs-dir-aware</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>tajo.worker.resource.dedicated</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>tajo.worker.resource.dedicated-memory-ratio</name>
>     <value>0.6</value>
>   </property>
> </configuration>
>
> On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:
>
> > Hello Azuyy yu
> >
> > I left some comments.
> >
> > -Jinho
> > Best regards
> >
> > 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
> >
> > > Hi,
> > >
> > > I tested Tajo before half a year, then not focus on Tajo because some
> > other
> > > works.
> > >
> > > then I setup a small dev Tajo cluster this week.(six nodes, VM) based
> on
> > > Hadoop-2.6.0.
> > >
> > > so my questions is:
> > >
> > > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn
> scheduler
> > > to manage  job resources. but now I found it doesn't rely on Yarn,
> > because
> > > I only start HDFS daemons, no yarn daemons. so Tajo has his own job
> > > sheduler ?
> > >
> > >
> > Now, tajo does using own task scheduler. and  You can start tajo without
> > Yarn daemons
> > Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
> >
> >
> > >
> > > 2) Does that we need to put the file replications on every nodes on
> Tajo
> > > cluster?
> > >
> >
> > No, tajo does not need more replication.  if you set more replication,
> data
> > locality can be increased
> >
> > such as I have a six nodes Tajo cluster, then should I set HDFS block
> > > replication to six? because:
> > >
> > > I noticed when I run Tajo query, some nodes are busy, but some is free.
> > > because the file's blocks are only located on these nodes. non others.
> > >
> > >
> > In my opinion, you need to run balancer
> >
> >
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
> >
> >
> > 3)the test data set is 4 million rows. nearly several GB. but it's very
> > > slow when I runing: select count(distinct ID) from ****;
> > > Any possible problems here?
> > >
> >
> > Could you share tajo-env.sh, tajo-site.xml ?
> >
> >
> > >
> > >
> > > Thanks
> > >
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Azuryy Yu <az...@gmail.com>.
Thanks Kim.

The following are my tajo-env.sh and tajo-site.xml:

*tajo-env.sh:*
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/local/java
_TAJO_OPTS="-server -verbose:gc
  -XX:+PrintGCDateStamps
  -XX:+PrintGCDetails
  -XX:+UseGCLogFileRotation
  -XX:NumberOfGCLogFiles=9
  -XX:GCLogFileSize=256m
  -XX:+DisableExplicitGC
  -XX:+UseCompressedOops
  -XX:SoftRefLRUPolicyMSPerMB=0
  -XX:+UseFastAccessorMethods
  -XX:+UseParNewGC
  -XX:+UseConcMarkSweepGC
  -XX:+CMSParallelRemarkEnabled
  -XX:CMSInitiatingOccupancyFraction=70
  -XX:+UseCMSCompactAtFullCollection
  -XX:CMSFullGCsBeforeCompaction=0
  -XX:+CMSClassUnloadingEnabled
  -XX:CMSMaxAbortablePrecleanTime=300
  -XX:+CMSScavengeBeforeRemark
  -XX:PermSize=160m
  -XX:GCTimeRatio=19
  -XX:SurvivorRatio=2
  -XX:MaxTenuringThreshold=60"
_TAJO_MASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
_TAJO_WORKER_OPTS="$_TAJO_OPTS -Xmx3g -Xms3g -Xmn1g"
_TAJO_QUERYMASTER_OPTS="$_TAJO_OPTS -Xmx512m -Xms512m -Xmn256m"
export TAJO_OPTS=$_TAJO_OPTS
export TAJO_MASTER_OPTS=$_TAJO_MASTER_OPTS
export TAJO_WORKER_OPTS=$_TAJO_WORKER_OPTS
export TAJO_QUERYMASTER_OPTS=$_TAJO_QUERYMASTER_OPTS
export TAJO_LOG_DIR=${TAJO_HOME}/logs
export TAJO_PID_DIR=${TAJO_HOME}/pids
export TAJO_WORKER_STANDBY_MODE=true

*tajo-site.xml:*

<configuration>
  <property>
    <name>tajo.rootdir</name>
    <value>hdfs://test-cluster/tajo</value>
  </property>
  <property>
    <name>tajo.master.umbilical-rpc.address</name>
    <value>10-0-86-51:26001</value>
  </property>
  <property>
    <name>tajo.master.client-rpc.address</name>
    <value>10-0-86-51:26002</value>
  </property>
  <property>
    <name>tajo.resource-tracker.rpc.address</name>
    <value>10-0-86-51:26003</value>
  </property>
  <property>
    <name>tajo.catalog.client-rpc.address</name>
    <value>10-0-86-51:26005</value>
  </property>
  <property>
    <name>tajo.worker.tmpdir.locations</name>
    <value>/test/tajo1,/test/tajo2,/test/tajo3</value>
  </property>
  <!--  worker  -->
  <property>
    <name>tajo.worker.resource.cpu-cores</name>
    <value>4</value>
  </property>
 <property>
   <name>tajo.worker.resource.memory-mb</name>
   <value>5120</value>
 </property>
  <property>
    <name>tajo.worker.resource.dfs-dir-aware</name>
    <value>true</value>
  </property>
  <property>
    <name>tajo.worker.resource.dedicated</name>
    <value>true</value>
  </property>
  <property>
    <name>tajo.worker.resource.dedicated-memory-ratio</name>
    <value>0.6</value>
  </property>
</configuration>
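
A rough sketch of how these settings translate into per-worker parallelism
(assuming concurrent tasks come out to roughly tajo.worker.resource.memory-mb
divided by the default 512 MB task slot, all sharing the 3 GB worker heap set
in tajo-env.sh):

# 5120 MB of worker resource memory over 512 MB task slots
echo $(( 5120 / 512 ))   # => 10 concurrent tasks per worker
# those 10 tasks share the 3072 MB heap from _TAJO_WORKER_OPTS
echo $(( 3072 / 10 ))    # => ~307 MB of heap per task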

On Fri, Jan 16, 2015 at 2:50 PM, Jinho Kim <jh...@apache.org> wrote:

> Hello Azuyy yu
>
> I left some comments.
>
> -Jinho
> Best regards
>
> 2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:
>
> > Hi,
> >
> > I tested Tajo before half a year, then not focus on Tajo because some
> other
> > works.
> >
> > then I setup a small dev Tajo cluster this week.(six nodes, VM) based on
> > Hadoop-2.6.0.
> >
> > so my questions is:
> >
> > 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn scheduler
> > to manage  job resources. but now I found it doesn't rely on Yarn,
> because
> > I only start HDFS daemons, no yarn daemons. so Tajo has his own job
> > sheduler ?
> >
> >
> Now, tajo does using own task scheduler. and  You can start tajo without
> Yarn daemons
> Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
>
>
> >
> > 2) Does that we need to put the file replications on every nodes on Tajo
> > cluster?
> >
>
> No, tajo does not need more replication.  if you set more replication, data
> locality can be increased
>
> such as I have a six nodes Tajo cluster, then should I set HDFS block
> > replication to six? because:
> >
> > I noticed when I run Tajo query, some nodes are busy, but some is free.
> > because the file's blocks are only located on these nodes. non others.
> >
> >
> In my opinion, you need to run balancer
>
> http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
>
>
> 3)the test data set is 4 million rows. nearly several GB. but it's very
> > slow when I runing: select count(distinct ID) from ****;
> > Any possible problems here?
> >
>
> Could you share tajo-env.sh, tajo-site.xml ?
>
>
> >
> >
> > Thanks
> >
>

Re: Some Tajo-0.9.0 questions

Posted by Jinho Kim <jh...@apache.org>.
Hello Azuryy Yu,

I left some comments.

-Jinho
Best regards

2015-01-16 14:37 GMT+09:00 Azuryy Yu <az...@gmail.com>:

> Hi,
>
> I tested Tajo before half a year, then not focus on Tajo because some other
> works.
>
> then I setup a small dev Tajo cluster this week.(six nodes, VM) based on
> Hadoop-2.6.0.
>
> so my questions is:
>
> 1) From I know half a yea ago, Tajo is work on Yarn, using Yarn scheduler
> to manage  job resources. but now I found it doesn't rely on Yarn, because
> I only start HDFS daemons, no yarn daemons. so Tajo has his own job
> sheduler ?
>
>
Tajo now uses its own task scheduler, and you can start Tajo without
YARN daemons.
Please refer to http://tajo.apache.org/docs/0.9.0/configuration.html
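
A minimal start-up sketch for that setup (the script names follow the standard
Hadoop and Tajo layouts; adjust the paths to your installation):

# start HDFS only; no YARN daemons are needed for Tajo
$HADOOP_HOME/sbin/start-dfs.sh
# start the Tajo master and workers, which use Tajo's own scheduler
$TAJO_HOME/bin/start-tajo.sh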


>
> 2) Does that we need to put the file replications on every nodes on Tajo
> cluster?
>

No, Tajo does not require extra replication. However, if you set a higher
replication factor, data locality can be improved.

> such as I have a six nodes Tajo cluster, then should I set HDFS block
> replication to six? because:
>
> I noticed when I run Tajo query, some nodes are busy, but some is free.
> because the file's blocks are only located on these nodes. non others.
>
>
In my opinion, you need to run the HDFS balancer:
http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#balancer
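
For example (the threshold and replication factor below are only illustrative,
and the table path is a placeholder):

# spread existing blocks more evenly across the DataNodes
hdfs balancer -threshold 10
# optionally raise the replication factor of a hot table directory for locality
hdfs dfs -setrep -w 3 /path/to/table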


> 3)the test data set is 4 million rows. nearly several GB. but it's very
> slow when I runing: select count(distinct ID) from ****;
> Any possible problems here?
>

Could you share your tajo-env.sh and tajo-site.xml?


>
>
> Thanks
>