Posted to dev@pig.apache.org by Ivan Veselovsky <iv...@griddynamics.com> on 2012/09/12 10:31:40 UTC

Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/
-----------------------------------------------------------

Review request for pig and Rohini Palaniswamy.


Description
-------

Please see https://issues.apache.org/jira/browse/PIG-2898 for details.


This addresses bug https://issues.apache.org/jira/browse/PIG-2898.


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1383357 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1383357 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1383357 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1383357 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1383357 

Diff: https://reviews.apache.org/r/7053/diff/


Testing
-------

Tested e2e test execution in both sequential (default) and parallel modes.
Test run duration measurements (as a function of the fork parameters) will be available soon.


Thanks,

Ivan Veselovsky


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On Oct. 4, 2012, 5:55 p.m., Rohini Palaniswamy wrote:
> > +1 non-binding. 
> > 
> > The tests passed fine for H20 and H23 when I ran them for branch-0.10. It hangs in local mode when run with the Y! distribution of pig using the -Dpig.dir option. I think we can address that in a separate jira.
> > 
> > The patch does not apply cleanly on 0.10 due to the recent PERL5LIB change. You will have to create one for 0.10. You also need to upload this patch to the jira.

The problem with the Y! distribution of pig was that it was not picking up PIG_OPTS. It worked fine after that was fixed.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review12175
-----------------------------------------------------------


On Oct. 3, 2012, 12:53 p.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2012, 12:53 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1393450 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review12175
-----------------------------------------------------------

Ship it!


+1 non-binding. 

The tests passed fine for H20 and H23 when I ran them for branch-0.10. It hangs in local mode when run with the Y! distribution of pig using the -Dpig.dir option. I think we can address that in a separate jira.

The patch does not apply cleanly on 0.10 due to the recent PERL5LIB change. You will have to create one for 0.10. You also need to upload this patch to the jira.

- Rohini Palaniswamy


On Oct. 3, 2012, 12:53 p.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2012, 12:53 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1393450 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1393450 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Ivan Veselovsky <iv...@griddynamics.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/
-----------------------------------------------------------

(Updated Oct. 25, 2012, 5:21 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
-------

Hi, Daniel, Rohini,
I implemented the required optimization which ensures that the local and HDFS directories are created only when needed (on demand).
These changes are in newly attached "PIG-2898-trunk-7.patch".

The idea of the fix is that we split the methods #globalSetup() and #globalCleanup() into two parts by introducing new methods #globalSetup2() and #globalCleanup2(). The method #globalSetup2() is invoked only if there is some test to execute, and #globalCleanup2() is invoked only if #globalSetup2() was invoked.
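
As an illustration only, the on-demand setup described above could be sketched as follows. This is a Python sketch, not the harness's Perl code; the class LazyGroupSetup and its method names are hypothetical stand-ins for #globalSetup2() and #globalCleanup2().

```python
class LazyGroupSetup:
    """Hypothetical stand-in for a driver with two-phase setup and cleanup."""

    def __init__(self):
        self.setup_done = False
        self.log = []

    def global_setup2(self):
        # Expensive part: in the harness this is where the local and
        # HDFS directories would be created.
        self.log.append("setup2")
        self.setup_done = True

    def global_cleanup2(self):
        self.log.append("cleanup2")

    def run_group(self, tests):
        try:
            for test in tests:
                # Run the expensive setup only once, right before the first test.
                if not self.setup_done:
                    self.global_setup2()
                self.log.append(f"run {test}")
        finally:
            # Clean up only if the expensive setup actually happened.
            if self.setup_done:
                self.global_cleanup2()
        return self.log


if __name__ == "__main__":
    print(LazyGroupSetup().run_group([]))
    print(LazyGroupSetup().run_group(["Test_1", "Test_2"]))
```

With an empty test list neither the setup nor the cleanup step runs, which is the saving the optimization is after.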

Also in this patch I reverted one of the previous changes, which replaced IPC::Run::run('mkdir' ...) with the Perl "mkpath" call, because "mkpath" appears to have (at least on my Perl 5.14.2) a rather strange feature: it returns a non-zero exit status with a "No such file or directory" message if the directory we're attempting to create already exists. This behavior is unexpected and confusing because it contradicts the behavior of native "mkdir -p" and java.io.File#mkdirs(). So, even though IPC::Run::run is slower, I prefer to use it to spare developers the trouble.
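
For comparison, the "succeed if the directory already exists" semantics of "mkdir -p" look like this in Python. This is only a sketch of the expected behavior, not the harness's Perl code; ensure_dir is a hypothetical helper name.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path and any missing parents; an existing directory is not an error."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "out", "pigtest")
        print(ensure_dir(target))  # first call creates the directories
        print(ensure_dir(target))  # second call finds them and still succeeds
```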


Description
-------

Please see https://issues.apache.org/jira/browse/PIG-2898 for details.


This addresses bug https://issues.apache.org/jira/browse/PIG-2898.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1402191 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1402191 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1402191 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1402191 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1402191 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1402191 

Diff: https://reviews.apache.org/r/7053/diff/


Testing
-------

Tested e2e test execution in both sequential (default) and parallel modes.
Test run duration measurements (as a function of the fork parameters) will be available soon.


Thanks,

Ivan Veselovsky


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Ivan Veselovsky <iv...@griddynamics.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/
-----------------------------------------------------------

(Updated Oct. 3, 2012, 12:53 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
-------

Patch #6, which addresses the latest comments on patch #5.


Description
-------

Please see https://issues.apache.org/jira/browse/PIG-2898 for details.


This addresses bug https://issues.apache.org/jira/browse/PIG-2898.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1393450 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1393450 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1393450 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1393450 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1393450 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1393450 

Diff: https://reviews.apache.org/r/7053/diff/


Testing
-------

Tested e2e test execution in both sequential (default) and parallel modes.
Test run duration measurements (as a function of the fork parameters) will be available soon.


Thanks,

Ivan Veselovsky


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Ivan Veselovsky <iv...@griddynamics.com>.

> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm, lines 89-95
> > <https://reviews.apache.org/r/7053/diff/2/?file=160775#file160775line89>
> >
> >     Any reason for this length based formatting? In fact it makes the string even more longer than before. Just curious to know why this is being done.

The reason is to avoid horizontal shifts in the results table. With the formatting we have
..... PASSED: 1    FAILED: 0    SKIPPED: 0    ....
.....
..... PASSED: 999  FAILED: 15   SKIPPED: 18   ....

while without the formatting we would have
..... PASSED: 1 FAILED: 0 SKIPPED: 0    ....
.....
..... PASSED: 999 FAILED: 15 SKIPPED: 18   ....
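
As an illustration of the padding idea (a Python sketch, not the Perl code in TestDriver.pm; the function name format_results and the width of 5 are assumptions chosen to match the sample lines above):

```python
def format_results(counts, width=5):
    """Left-justify every counter in a fixed-width field so columns line up."""
    return "".join(f"{label}: {value:<{width}}" for label, value in counts)


if __name__ == "__main__":
    early = [("PASSED", 1), ("FAILED", 0), ("SKIPPED", 0)]
    late = [("PASSED", 999), ("FAILED", 15), ("SKIPPED", 18)]
    print(format_results(early))
    print(format_results(late))
```

Because every value occupies the same number of characters, the FAILED and SKIPPED labels start at the same column on every line, whatever the counts are.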


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm, line 378
> > <https://reviews.apache.org/r/7053/diff/2/?file=160780#file160780line378>
> >
> >     1024m instead of 1025m

1025 was set for debugging purposes: it makes it possible to tell exactly where the setting comes from. :)
Changed back to 1024.


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm, lines 383-385
> > <https://reviews.apache.org/r/7053/diff/2/?file=160780#file160780line383>
> >
> >     Nice to see someone fix this one. Has always bothered my eyes while looking at the logs too see it appended so many times :). Can we remove the code altogether instead of commenting out.

done.


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf, lines 179-189
> > <https://reviews.apache.org/r/7053/diff/2/?file=160781#file160781line179>
> >
> >     Instead of commenting this out in this patch, we can create a new jira so that this test can be fixed. TestStreaming unit tests also are timing out in trunk. So there might be a bug with streaming that needs to be fixed before 0.11 is released. 
> >     
> >     The test runs fine in 0.10 in seq mode. Removing this will also make the patch apply cleanly on 0.10 else we need to create a patch for 0.10.

Actually this change was included in the patch just by accident: I just commented out this test to avoid the hang.
The issue with this test seems to be out of scope of the parallel e2e task, so, sure, I will exclude this change from the patch.


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl, lines 292-297
> > <https://reviews.apache.org/r/7053/diff/1-2/?file=152854#file152854line292>
> >
> >     Can't we do this by getting $ENV{'HADOOP_MAPRED_LOCAL_DIR'} in local.conf? Or is there some reason for doing it this way?

The only reason is backward compatibility: if somebody has a non-ant script to run test_harness.pl, that script will not have the HADOOP_MAPRED_LOCAL_DIR env variable set, so we provide the default in local.conf.
I re-implemented this in the following way: the default in local.conf is still present, but it is used only if the corresponding ENV key is not defined. The ENV value is passed through an ant variable, which, in turn, also has a reasonable default.
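
To illustrate the fallback order described above, here is a minimal Python sketch (not the harness's Perl, and not the actual patch). The function name resolve_local_dir is hypothetical, and the default path is simply the one suggested in the review comments, assumed here for illustration.

```python
import os

# Assumed default, taken from the value suggested in the review comments.
DEFAULT_LOCAL_DIR = "/tmp/hadoop/mapred/local"


def resolve_local_dir(env=None):
    """Prefer HADOOP_MAPRED_LOCAL_DIR from the environment, else the default."""
    env = os.environ if env is None else env
    value = env.get("HADOOP_MAPRED_LOCAL_DIR")
    return value if value else DEFAULT_LOCAL_DIR


if __name__ == "__main__":
    # Launched via ant: the variable is set and wins.
    print(resolve_local_dir({"HADOOP_MAPRED_LOCAL_DIR": "/data/mapred/local"}))
    # Launched by an older script that does not set it: the default applies.
    print(resolve_local_dir({}))
```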


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml, line 251
> > <https://reviews.apache.org/r/7053/diff/2/?file=160777#file160777line251>
> >
> >     Default /tmp/hadoop/mapred/local/?

please see the 1st comment in this review above.


> On Oct. 2, 2012, 6:23 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf, line 61
> > <https://reviews.apache.org/r/7053/diff/2/?file=160778#file160778line61>
> >
> >     Same as prev 2 comments. Just repeating it in context. Can this be ENV{HADOOP_MAPRED_LOCAL_DIR}  and define /tmp/hadoop/mapred/local as the default in build.xml similar to other variables in this conf?

please see the 1st comment in this review above.


- Ivan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review12079
-----------------------------------------------------------


On Sept. 28, 2012, 12:20 p.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2012, 12:20 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf 1390615 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review12079
-----------------------------------------------------------


Just a few minor comments.

I ran the patch on branch-0.10 after reverting the commented-out test in streaming_local.conf. It passed fine in Hadoop 1.0.2 but hung in Hadoop 0.23. With the previous patch, which changed to -Dmapreduce.cluster.local.dir, it was passing in Hadoop 0.23. I am trying to debug that.


http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl
<https://reviews.apache.org/r/7053/#comment25774>

    Can't we do this by getting $ENV{'HADOOP_MAPRED_LOCAL_DIR'} in local.conf? Or is there some reason for doing it this way?



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm
<https://reviews.apache.org/r/7053/#comment25777>

    Any reason for this length-based formatting? In fact it makes the string even longer than before. Just curious to know why this is being done.



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml
<https://reviews.apache.org/r/7053/#comment25741>

    Default /tmp/hadoop/mapred/local/?



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf
<https://reviews.apache.org/r/7053/#comment25740>

    Same as prev 2 comments. Just repeating it in context. Can this be ENV{HADOOP_MAPRED_LOCAL_DIR}  and define /tmp/hadoop/mapred/local as the default in build.xml similar to other variables in this conf? 



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm
<https://reviews.apache.org/r/7053/#comment25793>

    1024m instead of 1025m



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm
<https://reviews.apache.org/r/7053/#comment25739>

    Nice to see someone fix this one. It has always bothered my eyes to see it appended so many times while looking at the logs :). Can we remove the code altogether instead of commenting it out?



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf
<https://reviews.apache.org/r/7053/#comment25738>

    Instead of commenting this out in this patch, we can create a new jira so that this test can be fixed. TestStreaming unit tests also are timing out in trunk. So there might be a bug with streaming that needs to be fixed before 0.11 is released. 
    
    The test runs fine in 0.10 in seq mode. Removing this will also make the patch apply cleanly on 0.10 else we need to create a patch for 0.10.


- Rohini Palaniswamy


On Sept. 28, 2012, 12:20 p.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2012, 12:20 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf 1390615 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Ivan Veselovsky <iv...@griddynamics.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/
-----------------------------------------------------------

(Updated Sept. 28, 2012, 12:20 p.m.)


Review request for pig and Rohini Palaniswamy.


Changes
-------

Hi, Rohini,
all the mentioned suggestions were addressed in patch #5. This patch is cumulative: it aggregates all the changes made in the previous patches.

Notes:

   - In parallelized mode the context (like "[myfile.conf-MyGroup]") is printed after the results due to formatting issues (some contexts are too long).

   - On the trunk branch, the test streaming_local.conf/StreamingLocal_11 hangs in local mode (observed in both sequential and parallel execution modes). So I recommend commenting it out to get full results.

   - The local dir is parameterized with 'hadoop.mapred.local.dir' in ant, or 'HADOOP_MAPRED_LOCAL_DIR' in the environment.

   - Debug output is parameterized with 'e2e.debug' in ant, or 'E2E_DEBUG' in the environment.

   - I tested the patch on a 4-CPU Linux workstation, trunk branch.
          In parallel local mode (3*3) the full e2e test run took 91 min and gave the following results: Final results ,    PASSED: 555  FAILED: 2    SKIPPED: 65   ABORTED: 15   FAILED DEPENDENCY: 0
          In sequential local mode the tests are still running, but a previous run (which hung) took >180 min and gave similar pass/fail results. I'll update the info when the run finishes.


Description
-------

Please see https://issues.apache.org/jira/browse/PIG-2898 for details.


This addresses bug https://issues.apache.org/jira/browse/PIG-2898.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1390615 
  http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf 1390615 

Diff: https://reviews.apache.org/r/7053/diff/


Testing
-------

Tested e2e test execution in both sequential (default) and parallel modes.
Test run duration measurements (as a function of the fork parameters) will be available soon.


Thanks,

Ivan Veselovsky


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Ivan Veselovsky <iv...@griddynamics.com>.

> On Sept. 20, 2012, 9:06 p.m., Rohini Palaniswamy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm, lines 136-154
> > <https://reviews.apache.org/r/7053/diff/1/?file=152857#file152857line136>
> >
> >     This was taking 40 secs for each group in parallel mode as opposed to 6 secs in sequential mode and I am suspecting it is due to lot of forking. So running two tests with tests.to.run option took double the time of sequential (5 min to 12 min). That cost is immaterial if we are running the whole suite. But reducing it helps as most of the time you will be running a subset for testing. 
> >     
> >      Would be good to optimize by combining the two hdfs mkdirs into one command and use perl mkdir instead of IPC::Run (which internally does fork and 1 sec sleep).

The results of a fully sequential run under the same conditions:
     [exec] Final results ,    PASSED: 555  FAILED: 2    SKIPPED: 65   ABORTED: 15   FAILED DEPENDENCY: 0 
(That is, the results are the same as in the parallelized mode.)
The run took 250 minutes, that is, about 2.7 times longer than the parallelized one.


- Ivan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review11749
-----------------------------------------------------------


On Sept. 28, 2012, 12:20 p.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Sept. 28, 2012, 12:20 p.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/conf/local.conf 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1390615 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/tests/streaming_local.conf 1390615 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>


Re: Review Request: PIG-2898: allow to run pig e2e tests in parallel mode.

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7053/#review11749
-----------------------------------------------------------


General comments:
1) It would be good to rename file.fork.factor to fork.factor.conf.file and fork.factor to fork.factor.group for easy understanding of the intention of the parameters.

2) 	[exec] Results so far, PASSED: 1 FAILED: 0 SKIPPED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
 - It would be good to have another context[] before Results, which puts the forkname like [nightly.conf-CoGroup] or [nightly.conf] if it is a parallel execution. Without it reading results is difficult.

3) dbg and dumpHash
  There are a lot of commented-out dbg and dumpHash calls which need to be uncommented manually for debugging. Could we have these printed based on a -debug option or a -De2e.debug=true system property instead of keeping commented-out code, and rename dumpHash to dbgDumpHash?
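
As a rough illustration of suggestion 3, debug output can be gated on a switch instead of being commented in and out. The Python sketch below is not the harness's Perl; it assumes an environment variable named E2E_DEBUG (the name used later in this review) whose value "true" enables output, and the function names are hypothetical.

```python
import os
import sys


def debug_enabled(env=None):
    """Return True when the assumed E2E_DEBUG switch is set to 'true'."""
    env = os.environ if env is None else env
    return env.get("E2E_DEBUG", "").lower() == "true"


def dbg(message, env=None):
    """Print a debug line to stderr only when debugging is enabled."""
    if debug_enabled(env):
        print(f"[dbg] {message}", file=sys.stderr)
        return True
    return False


if __name__ == "__main__":
    dbg("visible only when the switch is on", env={"E2E_DEBUG": "true"})
    dbg("silently dropped", env={})
```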

Issues:
  - Running tests in local mode hangs. This needs to be fixed.
  - When running only two tests, the time taken was twice that of sequential mode because of the mkdir/rmr. That could be improved. More details in the code section.

Performance:
   - Performance improvement is really good when I tested it. Test for mapred mode was down to 2.5 hrs from 9 hrs with benchmark results already cached in our setup.  




http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm
<https://reviews.apache.org/r/7053/#comment25316>

    Context of the fork here



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm
<https://reviews.apache.org/r/7053/#comment25315>

    Context of the fork here.



http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm
<https://reviews.apache.org/r/7053/#comment25314>

    This was taking 40 secs for each group in parallel mode as opposed to 6 secs in sequential mode, and I suspect it is due to a lot of forking. So running two tests with the tests.to.run option took double the time of sequential mode (5 min to 12 min). That cost is immaterial if we are running the whole suite. But reducing it helps, as most of the time you will be running a subset for testing.
    
     It would be good to optimize by combining the two hdfs mkdirs into one command and using perl mkdir instead of IPC::Run (which internally does a fork and a 1 sec sleep).


- Rohini Palaniswamy


On Sept. 12, 2012, 8:31 a.m., Ivan Veselovsky wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7053/
> -----------------------------------------------------------
> 
> (Updated Sept. 12, 2012, 8:31 a.m.)
> 
> 
> Review request for pig and Rohini Palaniswamy.
> 
> 
> Description
> -------
> 
> Please see https://issues.apache.org/jira/browse/PIG-2898 for details.
> 
> 
> This addresses bug https://issues.apache.org/jira/browse/PIG-2898.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/TestDriver.pm 1383357 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/harness/test_harness.pl 1383357 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/build.xml 1383357 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/deployers/ExistingClusterDeployer.pm 1383357 
>   http://svn.apache.org/repos/asf/pig/trunk/test/e2e/pig/drivers/TestDriverPig.pm 1383357 
> 
> Diff: https://reviews.apache.org/r/7053/diff/
> 
> 
> Testing
> -------
> 
> Tested e2e tests execution in both sequential (default) and parallel modes.
> The test run duration measurement data (in dependency on the fork parameters) will be available soon.
> 
> 
> Thanks,
> 
> Ivan Veselovsky
> 
>