Posted to issues@hive.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2016/05/28 19:31:12 UTC

[jira] [Comment Edited] (HIVE-13511) Run clidriver tests from within the qtest dir for the precommit tests

    [ https://issues.apache.org/jira/browse/HIVE-13511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305555#comment-15305555 ] 

Siddharth Seth edited comment on HIVE-13511 at 5/28/16 7:30 PM:
----------------------------------------------------------------

Thanks [~spena]. I think the problem introduced by this patch is now resolved.

However, the new runs are seeing a lot more failures. There's some interesting stuff in the logs:
1. It looks like test runs are getting killed randomly (I believe this problem existed earlier as well?)
2. Some tests fail because there is not enough memory available for the VM to continue:
{code}
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 105381888 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/hiveptest/54.177.86.160-hiveptest-1/apache-github-source-source/itests/qtest-spark/hs_err_pid9151.log
{code}

I wonder if the random kills are related as well, i.e. at some point the box gets a little overloaded and the OS decides to kill processes? The syslog output on the systems would be interesting to look at.
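One way to check the syslog for this, as a minimal sketch: scan the log lines for the Linux kernel's OOM-killer message. It assumes the boxes run Linux and that the kernel logs lines containing "Out of memory: Kill process <pid> (<name>)" (newer kernels write "Killed process"). The function name, the log location and the sample line are illustrative only and are not taken from the ptest hosts.

```python
import re

# Matches the kernel OOM-killer line in both its older ("Kill process")
# and newer ("Killed process") wording.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return (pid, process_name) for every OOM-killer line in `lines`."""
    kills = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    # Made-up sample line for illustration; not real output from the test boxes.
    sample = [
        "May 28 19:00:01 host kernel: Out of memory: Kill process 1234 (java) "
        "score 512 or sacrifice child",
        "May 28 19:00:02 host sshd[42]: Accepted publickey for user",
    ]
    for pid, name in find_oom_kills(sample):
        print(f"OOM killer hit pid={pid} name={name}")
```

On a real box the input would be the lines of the system log (commonly /var/log/syslog or /var/log/messages, depending on the distribution). If java processes show up there around the time of the random kills, that would support the overload theory.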

From looking at various process sizes:
maven seems to use a 2GB heap per test (potentially 2 processes per test?)
TestMiniTezCliDriver runs AMs and containers with a heap of 819M
Spark tests run AMs and executors with a heap of 1G each

I can look at reducing the size occupied by Tez AMs and containers. Given the small amount of data processed in tests, this change seems reasonable. Can the same be done for Spark?

Alternatively, should we try reducing the number of drones per box from 3 to 2, and see how that affects the system?
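To get a feel for the numbers, here is a rough back-of-the-envelope sketch of heap demand per box. The heap sizes are the ones observed above. The number of maven processes and of AM/container/executor JVMs alive at once per drone are assumptions (2 each, purely for illustration), and only heap is counted, so real usage (metaspace, thread stacks, native memory) would be higher.

```python
MAVEN_HEAP_MB = 2048   # observed: ~2GB heap per maven test process
TEZ_HEAP_MB = 819      # observed: Tez AMs and containers
SPARK_HEAP_MB = 1024   # observed: Spark AMs and executors


def drone_heap_mb(maven_procs, worker_heap_mb, worker_procs):
    """Heap-only estimate for one drone: maven JVMs plus AM/container JVMs."""
    return MAVEN_HEAP_MB * maven_procs + worker_heap_mb * worker_procs


def box_heap_mb(drones, per_drone_mb):
    """Heap-only estimate for a box running `drones` drones concurrently."""
    return drones * per_drone_mb


if __name__ == "__main__":
    # Assumption: 2 maven processes and 2 worker JVMs per drone at a time.
    for label, heap in (("tez", TEZ_HEAP_MB), ("spark", SPARK_HEAP_MB)):
        per_drone = drone_heap_mb(maven_procs=2, worker_heap_mb=heap, worker_procs=2)
        for drones in (3, 2):
            total = box_heap_mb(drones, per_drone)
            print(f"{label}: {drones} drones -> ~{total} MB heap")
```

Comparing these totals against the physical memory on the ptest boxes would show whether going from 3 drones to 2, or shrinking the Tez and Spark heaps, leaves enough headroom.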



was (Author: sseth):
Thanks [~spena]. I think the problem introduced by this patch is now resolved.

However, the new runs are seeing a lot more failures. There's some interesting stuff in the logs:
1. It looks like test runs are getting killed randomly (I believe this problem existed earlier as well?)
2. Some tests fail to even start because there is not enough memory available for the VM to continue:
{code}
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 105381888 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/hiveptest/54.177.86.160-hiveptest-1/apache-github-source-source/itests/qtest-spark/hs_err_pid9151.log
{code}

I wonder if the random kills are related as well, i.e. at some point the box gets a little overloaded and the OS decides to kill processes? The syslog output on the systems would be interesting to look at.

From looking at various process sizes:
maven seems to use a 2GB heap per test (potentially 2 processes per test?)
TestMiniTezCliDriver runs AMs and containers with a heap of 819M
Spark tests run AMs and executors with a heap of 1G each

I can look at reducing the size occupied by Tez AMs and containers. Given the small amount of data processed in tests, this change seems reasonable. Can the same be done for Spark?

Alternatively, should we try reducing the number of drones per box from 3 to 2, and see how that affects the system?


> Run clidriver tests from within the qtest dir for the precommit tests
> ---------------------------------------------------------------------
>
>                 Key: HIVE-13511
>                 URL: https://issues.apache.org/jira/browse/HIVE-13511
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>             Fix For: 2.1.0
>
>         Attachments: HIVE-13511.01.patch, HIVE-13511.02.patch, HIVE-13511.03.addendum.patch, HIVE-13511.03.patch, example_maven-test.txt, example_testExecution.txt, failedScriptPostPatch.txt
>
>
> The tests are currently run from the itests directory - which means there's additional overhead of having to at least check whether files have changed. Will attach a sample output - this adds up to 40+ seconds per batch. Getting rid of this should be a reasonable saving overall.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)