Posted to issues@hive.apache.org by "Peter Vary (JIRA)" <ji...@apache.org> on 2016/09/01 22:14:20 UTC
[jira] [Comment Edited] (HIVE-14675) Ensure directories are cleaned
up on test cleanup in QTestUtil
[ https://issues.apache.org/jira/browse/HIVE-14675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456776#comment-15456776 ]
Peter Vary edited comment on HIVE-14675 at 9/1/16 10:13 PM:
------------------------------------------------------------
Hi,
Yeah, this helped a lot. The relevant part is in the main pom.xml; these steps run before the tests:
{noformat}
<plugins>
  <!-- plugins are always listed in sorted order by groupId, artifectId -->
  [..]
      <execution>
        <id>setup-test-dirs</id>
        <phase>process-test-resources</phase>
        <goals>
          <goal>run</goal>
        </goals>
        <configuration>
          <target>
            <delete dir="${test.tmp.dir}" />
            <delete dir="${test.warehouse.dir}" />
            <mkdir dir="${test.tmp.dir}" />
            <mkdir dir="${test.warehouse.dir}" />
            <mkdir dir="${test.tmp.dir}/conf" />
            <!-- copies hive-site.xml so it can be modified -->
            <copy todir="${test.tmp.dir}/conf/">
              <fileset dir="${basedir}/${hive.path.to.root}/data/conf/"/>
            </copy>
          </target>
        </configuration>
      </execution>
    </executions>
  </plugin>
  [..]
</plugins>
{noformat}
As for afterTestClass: it is typically just clearPostTestEffects, which only shuts down ZooKeeper, so there is no deletion there.
I have found the following hive-site.xml files:
- ./beeline/src/test/resources/hive-site.xml
- ./conf/hive-site.xml
- ./data/conf/hive-site.xml
- ./data/conf/llap/hive-site.xml
- ./data/conf/perf-reg/hive-site.xml
- ./data/conf/spark/standalone/hive-site.xml
- ./data/conf/spark/yarn-client/hive-site.xml
- ./data/conf/tez/hive-site.xml
- ./hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-site.xml
The values I have found:
{noformat}
- ${test.tmp.dir}/metastore_db
- ${test.tmp.dir}/warehouse
- ${test.tmp.dir}/hadoop-tmp
- ${test.tmp.dir}/scratchdir
- ${test.tmp.dir}/localscratchdir/
- ${test.tmp.dir}/junit_metastore_db
- ${test.warehouse.dir}
- file://${test.tmp.dir}/metadb/
- ${test.tmp.dir}/log/
- ${test.tmp.dir}/tmp
- ${spark.home}/logs/ - In files: ./data/conf/spark/yarn-client/hive-site.xml, ./data/conf/spark/standalone/hive-site.xml
- /tmp/webhcat_e2e/logs/webhcat_e2e_metastore_db - In file: ./hcatalog/src/test/e2e/templeton/deployers/config/hive/hive-site.xml
{noformat}
I have checked the other XML files in the data/conf directory, but did not find any other directories there.
spark.home is defined in itests/hive-unit/pom.xml and qtest-spark/pom.xml:
{noformat}
<spark.home>${basedir}/${hive.path.to.root}/itests/qtest-spark/target/spark</spark.home>
{noformat}
To sum up, it seems that removing the warehouse and tmp directories after the tests is enough to remove the files/directories created based on the test configurations. There are 2 cases where a directory is defined in the configuration but not removed:
{noformat}
- ${spark.home}/logs/ - this might not be a problem at all
- The metastore db for hcatalog tests might not be removed, which might cause some problems (but this is not QTestUtil :) )
{noformat}
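For illustration only, a minimal sketch of what a matching post-test cleanup could look like, mirroring the setup-test-dirs execution above. The execution id cleanup-test-dirs and the prepare-package phase are my assumptions, not something that exists in the Hive pom.xml. Note also that, by default, Maven stops the build when the test phase fails, so a cleanup bound to a later phase would not run after failing tests:
{noformat}
<execution>
  <!-- hypothetical id, not present in the Hive pom.xml -->
  <id>cleanup-test-dirs</id>
  <!-- assumed phase: the first lifecycle phase after "test" -->
  <phase>prepare-package</phase>
  <goals>
    <goal>run</goal>
  </goals>
  <configuration>
    <target>
      <!-- failonerror="false" keeps a failed delete (e.g. a locked file) from failing the build -->
      <delete dir="${test.tmp.dir}" failonerror="false" />
      <delete dir="${test.warehouse.dir}" failonerror="false" />
    </target>
  </configuration>
</execution>
{noformat}
This would only cover the directories under test.tmp.dir and test.warehouse.dir; the two cases listed above would still need separate handling.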
During my code walkthroughs I found Java code that creates a tmp dir for ZooKeeper in test.tmp.dir, but some other tests might use different directory patterns. Finding these is a harder nut to crack.
Is this the analysis you were looking for?
Thanks,
Peter
~Edit: Fighting with the formatter :)~
> Ensure directories are cleaned up on test cleanup in QTestUtil
> --------------------------------------------------------------
>
> Key: HIVE-14675
> URL: https://issues.apache.org/jira/browse/HIVE-14675
> Project: Hive
> Issue Type: Sub-task
> Reporter: Siddharth Seth
>
> Need to verify whether they are cleaned up or not. There's 4-5 different directories involved. If I'm not mistaken, they get cleaned up before each test invocation via mvn.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)