Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2014/02/26 03:07:19 UTC

[jira] [Updated] (PIG-3780) Tez mini cluster tests run for a very long time with TezSession reuse on

     [ https://issues.apache.org/jira/browse/PIG-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3780:
-------------------------------

    Attachment: PIG-3780-1.patch

In the attached patch, I am doing three things:
# Set "tez.session.reuse" to false in the properties object and reuse that reference to start PigServer everywhere. I noticed that cluster.getProperties() returns a copy of the properties object rather than a reference to it, so a property set on the returned object without keeping that reference never takes effect.
# TestCustomerPartitioner was failing because "part-r-00000" was hardcoded in the test cases, but the Tez output filename doesn't follow this convention. I changed the test to use FileStatus instead.
# TestTezCompiler was broken due to a mismatch in the gold files. This is probably because we haven't run the unit tests for a while. I regenerated them to reflect recent changes.
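
To illustrate the first point, here is a minimal sketch of the copy-versus-reference pitfall. It is not taken from the patch: FakeMiniCluster, brokenSetup and fixedSetup are hypothetical names, and the only assumption carried over from above is that getProperties() hands back a copy on every call.

```java
import java.util.Properties;

public class PropertiesCopyDemo {

    /** Hypothetical stand-in for a mini cluster whose getter returns a copy (assumption). */
    static class FakeMiniCluster {
        private final Properties props = new Properties();

        Properties getProperties() {
            // Each call builds a new object, so callers never see the internal instance.
            Properties copy = new Properties();
            copy.putAll(props);
            return copy;
        }
    }

    /** Broken pattern: the flag is set on one copy, then a different copy is handed on. */
    static Properties brokenSetup(FakeMiniCluster cluster) {
        cluster.getProperties().setProperty("tez.session.reuse", "false");
        return cluster.getProperties(); // fresh copy, the flag above is lost
    }

    /** Fixed pattern: keep a single reference, set the flag on it, and pass that same object on. */
    static Properties fixedSetup(FakeMiniCluster cluster) {
        Properties properties = cluster.getProperties();
        properties.setProperty("tez.session.reuse", "false");
        return properties;
    }

    public static void main(String[] args) {
        FakeMiniCluster cluster = new FakeMiniCluster();
        // prints "broken: null"
        System.out.println("broken: " + brokenSetup(cluster).getProperty("tez.session.reuse"));
        // prints "fixed:  false"
        System.out.println("fixed:  " + fixedSetup(cluster).getProperty("tez.session.reuse"));
    }
}
```

In a test, the object returned by the fixed pattern is the one that would be passed to the PigServer constructor, so the setting travels with it.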

I have confirmed all the unit tests in test-tez pass.

> Tez mini cluster tests run for a very long time with TezSession reuse on
> ------------------------------------------------------------------------
>
>                 Key: PIG-3780
>                 URL: https://issues.apache.org/jira/browse/PIG-3780
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>    Affects Versions: tez-branch
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: tez-branch
>
>         Attachments: PIG-3780-1.patch
>
>
> In the current tez branch, mini cluster unit tests are very slow. The reason is as follows:
> * TezSession reuse is on by default.
> * Each test case runs and then waits for the Tez AM to terminate.
> * After the Tez AM times out (usually after several minutes), another test case runs.
> Two questions that I have are:
> # Why doesn't TezSession reuse work in mini cluster?
> # Why is TezSession reuse not disabled in some tests (e.g. TestAccumulator) where we explicitly set "tez.session.reuse" to false?
> As for #2, I realized that "tez.session.reuse" was never set in the properties object that is passed to PigServer. I am going to upload a patch that fixes this problem in this jira.
> As for #1, I don't have an answer yet. But I think we can fix this in a separate jira once we get Tez unit tests working again.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)