You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tajo.apache.org by Hyunsik Choi <hy...@apache.org> on 2014/02/07 19:23:57 UTC
Review Request 17852: TAJO-548: Investigate frequent young gc
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17852/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-548
https://issues.apache.org/jira/browse/TAJO-548
Repository: tajo
Description
-------
Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
Here is my configuration for each worker jvm
{noformat}
export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
{noformat}
I observed very frequent young gc
{noformat}
$ jstat -gcutil `pgrep -f TajoWorker` 1000
S0 S1 E O P YGC YGCT FGC FGCT GCT
0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.41 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.44 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.53 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.57 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
0.00 2.13 66.73 3.79 99.66 5185 75.042 0 0.000 75.042
0.00 66.46 17.89 3.79 99.66 5190 75.082 0 0.000 75.082
8.79 0.00 37.36 3.88 99.66 5202 75.533 0 0.000 75.533
0.00 9.10 21.61 3.88 99.66 5216 75.926 0 0.000 75.926
... 105.523
{noformat}
After finish the Q1 query, GCT is at 105.523, which means young gc cost 30 seconds. As well known, ParNew YGC is a kind of stop the world gc. The whole query need about 98 secs. So in this case, young gc waste nearly 1 third of the query time.
Diffs
-----
tajo-common/src/main/java/org/apache/tajo/util/Bytes.java 0b5a4706efa216aea86f2abd8ea6645bdbb24233
tajo-common/src/test/java/org/apache/tajo/util/TestBytes.java 98d8eac36347a34ccef35a49f44c692925f32918
tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithGroupby.sql 7c3895ee781e94e57d6710d8b838ea5ea2d83fd4
tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithLimit.sql 9c99a66220c2670b3a4c25d2c2274e8a2bbd4187
tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithOrderby.sql e9ab88d2be8fa450f75abf6d16b1085e4e4cf131
tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithUnion.sql c9d68886385a87cf844ed51f526a2c4b508ae4e3
tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/testCtasWithColumnedPartition.sql d5f89d690fef9fe32f4c05112826fa32e7dcc9df
tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithGroupby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithLimit.result b9bc210bfbba39ef7153f12377c39ddba6b15d04
tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithOrderby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
tajo-storage/src/main/java/org/apache/tajo/storage/TextSerializerDeserializer.java 07ea79b09041e17e6aec276d3fd5cbccfd143a1c
Diff: https://reviews.apache.org/r/17852/diff/
Testing
-------
Thanks,
Hyunsik Choi
Re: Review Request 17852: TAJO-548: Investigate frequent young gc
Posted by Jihoon Son <ji...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17852/#review34014
-----------------------------------------------------------
This patch looks good to me. I'll test it on my in-house cluster.
- Jihoon Son
On Feb. 8, 2014, 3:23 a.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17852/
> -----------------------------------------------------------
>
> (Updated Feb. 8, 2014, 3:23 a.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-548
> https://issues.apache.org/jira/browse/TAJO-548
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
> Here is my configuration for each worker jvm
> {noformat}
> export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
> {noformat}
>
> I observed very frequent young gc
> {noformat}
> $ jstat -gcutil `pgrep -f TajoWorker` 1000
> S0 S1 E O P YGC YGCT FGC FGCT GCT
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.41 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.44 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.53 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.57 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 66.73 3.79 99.66 5185 75.042 0 0.000 75.042
> 0.00 66.46 17.89 3.79 99.66 5190 75.082 0 0.000 75.082
> 8.79 0.00 37.36 3.88 99.66 5202 75.533 0 0.000 75.533
> 0.00 9.10 21.61 3.88 99.66 5216 75.926 0 0.000 75.926
> ... 105.523
> {noformat}
>
> After finish the Q1 query, GCT is at 105.523, which means young gc cost 30 seconds. As well known, ParNew YGC is a kind of stop the world gc. The whole query need about 98 secs. So in this case, young gc waste nearly 1 third of the query time.
>
>
> Diffs
> -----
>
> tajo-common/src/main/java/org/apache/tajo/util/Bytes.java 0b5a4706efa216aea86f2abd8ea6645bdbb24233
> tajo-common/src/test/java/org/apache/tajo/util/TestBytes.java 98d8eac36347a34ccef35a49f44c692925f32918
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithGroupby.sql 7c3895ee781e94e57d6710d8b838ea5ea2d83fd4
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithLimit.sql 9c99a66220c2670b3a4c25d2c2274e8a2bbd4187
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithOrderby.sql e9ab88d2be8fa450f75abf6d16b1085e4e4cf131
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithUnion.sql c9d68886385a87cf844ed51f526a2c4b508ae4e3
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/testCtasWithColumnedPartition.sql d5f89d690fef9fe32f4c05112826fa32e7dcc9df
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithGroupby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithLimit.result b9bc210bfbba39ef7153f12377c39ddba6b15d04
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithOrderby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
> tajo-storage/src/main/java/org/apache/tajo/storage/TextSerializerDeserializer.java 07ea79b09041e17e6aec276d3fd5cbccfd143a1c
>
> Diff: https://reviews.apache.org/r/17852/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Hyunsik Choi
>
>
Re: Review Request 17852: TAJO-548: Investigate frequent young gc
Posted by Jihoon Son <ji...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17852/#review34254
-----------------------------------------------------------
Ship it!
+1. The patch looks good to me.
I tested the patch in our cluster.
Actually, I couldn't find significant improvements for the gc overhead, but TPC-H Q1 query maybe was not a heavy query in our cluster.
However, I expect that this patch will reduce the gc overhead significantly when users run queries which require very heavy computations.
- Jihoon Son
On Feb. 8, 2014, 3:23 a.m., Hyunsik Choi wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17852/
> -----------------------------------------------------------
>
> (Updated Feb. 8, 2014, 3:23 a.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-548
> https://issues.apache.org/jira/browse/TAJO-548
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Two days ago, I ran a 300X TPCH Q1 on a 7 slaves , 1 master tajo clsuter.
> Here is my configuration for each worker jvm
> {noformat}
> export TAJO_WORKER_OPTS="-Xmx20g -Xms20g -XX:MaxPermSize=512m -verbose:gc -Xloggc:$TAJO_LOG_DIR/worker-gc.log -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedOops "
> {noformat}
>
> I observed very frequent young gc
> {noformat}
> $ jstat -gcutil `pgrep -f TajoWorker` 1000
> S0 S1 E O P YGC YGCT FGC FGCT GCT
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.40 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.41 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.42 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.43 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.44 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.53 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.57 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.58 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.59 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 57.60 3.79 99.65 5185 75.042 0 0.000 75.042
> 0.00 2.13 66.73 3.79 99.66 5185 75.042 0 0.000 75.042
> 0.00 66.46 17.89 3.79 99.66 5190 75.082 0 0.000 75.082
> 8.79 0.00 37.36 3.88 99.66 5202 75.533 0 0.000 75.533
> 0.00 9.10 21.61 3.88 99.66 5216 75.926 0 0.000 75.926
> ... 105.523
> {noformat}
>
> After finish the Q1 query, GCT is at 105.523, which means young gc cost 30 seconds. As well known, ParNew YGC is a kind of stop the world gc. The whole query need about 98 secs. So in this case, young gc waste nearly 1 third of the query time.
>
>
> Diffs
> -----
>
> tajo-common/src/main/java/org/apache/tajo/util/Bytes.java 0b5a4706efa216aea86f2abd8ea6645bdbb24233
> tajo-common/src/test/java/org/apache/tajo/util/TestBytes.java 98d8eac36347a34ccef35a49f44c692925f32918
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithGroupby.sql 7c3895ee781e94e57d6710d8b838ea5ea2d83fd4
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithLimit.sql 9c99a66220c2670b3a4c25d2c2274e8a2bbd4187
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithOrderby.sql e9ab88d2be8fa450f75abf6d16b1085e4e4cf131
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/CtasWithUnion.sql c9d68886385a87cf844ed51f526a2c4b508ae4e3
> tajo-core/tajo-core-backend/src/test/resources/queries/TestCTASQuery/testCtasWithColumnedPartition.sql d5f89d690fef9fe32f4c05112826fa32e7dcc9df
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithGroupby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithLimit.result b9bc210bfbba39ef7153f12377c39ddba6b15d04
> tajo-core/tajo-core-backend/src/test/resources/results/TestCTASQuery/testCtasWithOrderby.result fe96e1428754e7af6dd4bdab0b63c25ed2bc12a0
> tajo-storage/src/main/java/org/apache/tajo/storage/TextSerializerDeserializer.java 07ea79b09041e17e6aec276d3fd5cbccfd143a1c
>
> Diff: https://reviews.apache.org/r/17852/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Hyunsik Choi
>
>