Posted to dev@crunch.apache.org by "Matthias Friedrich (JIRA)" <ji...@apache.org> on 2012/07/22 09:22:35 UTC

[jira] [Created] (CRUNCH-24) Make test suite suitable for continuous integration

Matthias Friedrich created CRUNCH-24:
----------------------------------------

             Summary: Make test suite suitable for continuous integration
                 Key: CRUNCH-24
                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
             Project: Crunch
          Issue Type: Task
    Affects Versions: 0.3.0
            Reporter: Matthias Friedrich
            Assignee: Matthias Friedrich
             Fix For: 0.3.0


Right now the integration test suite leaves about 80 files behind in /tmp, which makes it unsuitable for a shared continuous integration environment. Examples of these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).

We have to delete these files or make sure they aren't created on /tmp in the first place.
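
One way to check whether a run leaves such files behind is to compare the temp directory before and after the run. The following is only a rough sketch using plain JDK classes; the class TmpLeakCheck and its methods are hypothetical and not part of Crunch, and the glob simply mirrors the "output*" and "crunch*" names mentioned above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper (not part of Crunch): finds temp-dir entries left behind by a run. */
public class TmpLeakCheck {

    /** Returns the names of all entries directly inside dir whose name matches the glob. */
    public static Set<String> snapshot(Path dir, String glob) throws IOException {
        Set<String> names = new TreeSet<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    /** Returns the entries present in 'after' but not in 'before', i.e. the leftovers. */
    public static Set<String> leaked(Set<String> before, Set<String> after) {
        Set<String> diff = new TreeSet<>(after);
        diff.removeAll(before);
        return diff;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        String glob = "{output,crunch}*"; // names taken from the issue description
        Set<String> before = snapshot(tmp, glob);
        // ... run the test suite here ...
        Set<String> after = snapshot(tmp, glob);
        System.out.println("Left behind: " + leaked(before, after));
    }
}
```

On a shared CI machine other jobs may write to the same directory at the same time, so a non-empty result is a hint rather than proof.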

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills resolved CRUNCH-24.
------------------------------

    Resolution: Fixed
    
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-Cumulative-1.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420951#comment-13420951 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

If I remove the Configuration argument, it works. The exception is mainly what you would expect-- the CrunchMapper cannot find the file that it tried to store in the DistributedCache:

6526 [Thread-26] INFO  org.apache.crunch.impl.mr.exec.CrunchJob  - Job status available at: http://localhost:8080/
6564 [Thread-46] WARN  org.apache.hadoop.mapred.LocalJobRunner  - job_local_0002
org.apache.crunch.impl.mr.run.CrunchRuntimeException: Error reading right-side of map side join: 
	at org.apache.crunch.lib.join.MapsideJoin$MapsideJoinDoFn.initialize(MapsideJoin.java:134)
	at org.apache.crunch.DoFn.setContext(DoFn.java:100)
	at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:61)
	at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:63)
	at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:63)
	at org.apache.crunch.impl.mr.run.CrunchTaskContext.getNodes(CrunchTaskContext.java:56)
	at org.apache.crunch.impl.mr.run.CrunchMapper.setup(CrunchMapper.java:40)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
Caused by: java.io.IOException: No files found to materialize at: file:/tmp/hadoop-josh/mapred/local/archive/313161001601735375_-1266581102_898898625/file/var/folders/ga/gauyLIkgEiuMA30hW9H3YU+++TI/-Tmp-/junit7903276863692299885/junit3527640565817378898/crunch-1754643543/p1
	at org.apache.crunch.io.CompositePathIterable.create(CompositePathIterable.java:48)
	at org.apache.crunch.io.seq.SeqFileTableSource.read(SeqFileTableSource.java:48)
	at org.apache.crunch.io.impl.ReadableSourcePathTargetImpl.read(ReadableSourcePathTargetImpl.java:35)
	at org.apache.crunch.lib.join.MapsideJoin$MapsideJoinDoFn.initialize(MapsideJoin.java:132)
	... 10 more
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420813#comment-13420813 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

Weird. I applied my first patch and Rahul's second patch against a freshly cloned repo. Apart from whitespace warnings (my bad) they applied cleanly and "mvn verify" succeeded. Did "mvn verify" work for you with just the first patch?
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420777#comment-13420777 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

Hey Rahul-- thanks for the patch. When I applied it, I got a couple of failures in the MapsideJoinIT-- could you check that one locally and see if it fails for you as well?
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420867#comment-13420867 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

There are only two changes in MapsideJoinIT; both redirect crunch.tmp.dir to a temporary directory (/tmp/junitRANDOM). If you run this test in isolation, does it work? If you remove the Configuration argument from MRPipeline, does it work then? And what are the error messages?

I'll run this on a second machine tomorrow to get more data points :)
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Gabriel Reid (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421241#comment-13421241 ] 

Gabriel Reid commented on CRUNCH-24:
------------------------------------

Josh, your changes to MapsideJoin look like they shouldn't break anything, but it also seems very odd that they would be necessary. I just tried running the tests on my Linux machine and it runs without any problems. If I've understood it correctly, you're only running into this on OS X? If you guys don't mind waiting, I'll take a look at this on my OS X machine later this evening (I've only got OS X at home) to see if I can find what the underlying problem is on OS X.
                

        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Friedrich updated CRUNCH-24:
-------------------------------------

    Attachment: CRUNCH-24-Cumulative-1.patch

The crunch module is now clean (see patch CRUNCH-24-Cumulative-1.patch) except for WordCountHBaseIT, which still creates an empty directory tree below /tmp/hadoop-${USER} that I couldn't get rid of. This isn't serious, though: the directory always has the same name, so files won't accumulate.

Next up is scrunch. I don't know Scala, so I'm not the best person to do this. It shouldn't be as much work as for crunch; all you have to do is track down calls to Files.createTempDir() and FileHelper.createOutputPath(). There are calls to deleteOnExit(), but that never works for directories. Redirecting "hadoop.tmp.dir" is necessary too, I guess.
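
For context on the deleteOnExit() remark: as far as I know, java.io.File.deleteOnExit() ends up doing a plain File.delete() at JVM shutdown, and that call refuses directories that still contain entries. A possible alternative is to delete the tree explicitly, children first. The sketch below uses only JDK classes; RecursiveDelete and deleteTree are hypothetical names and are not taken from any of the attached patches.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper, not taken from the Crunch patches. */
public class RecursiveDelete {

    /** Deletes root and everything below it, visiting children before their parents. */
    public static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return; // nothing to clean up
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc; // propagate errors hit while listing the directory
                }
                Files.delete(dir); // all children are gone by now
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("scratch");
        Files.createFile(dir.resolve("part-00000"));
        // A plain delete of a non-empty directory fails and returns false.
        System.out.println("plain delete succeeded: " + dir.toFile().delete());
        deleteTree(dir);
        System.out.println("exists after deleteTree: " + Files.exists(dir));
    }
}
```

Calling something like this from a test's teardown keeps cleanup independent of JVM shutdown order.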

Thanks a lot for your help guys!
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420834#comment-13420834 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

I ran mvn clean install, running mvn verify now.
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421811#comment-13421811 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

I managed to clean up a few more, mostly by setting "hadoop.tmp.dir" to get rid of "/tmp/hadoop-${USER}/". But I'm too smart to post patches this late at night ;-)

What's left: one leftover in WordCountHBaseIT that I couldn't prevent, and the scrunch module.
                

        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420225#comment-13420225 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

Looks great, Rahul, thank you! I think we covered most of them; there is just one last test to tame. UnionCollectionIT is a tricky beast because it uses JUnit's @Parameters mechanism. Do you want to give it a try?
                

        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated CRUNCH-24:
-----------------------------

    Attachment: CRUNCH-24-josh.patch

I got the tests working on my OS X instance with the following hack to the initialization code for the map-side join. I'm going to cc Gabriel so he can take a look at it. It was odd that I had to do it, but I don't think it's that harmful.
                

        

Re: [jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by Rahul <rs...@xebia.com>.
Hi Matthias,

+1 for the second idea. I was thinking of adding a method there that
sets the required property in the Configuration, and then using
TemporaryPath as a @Rule component.

- Rahul

On 22-07-2012 16:35, Matthias Friedrich wrote:
> Hi Rahul,
>
> that would be really great :)
>
> My idea of solving this was to use the new self-cleaning TemporaryPath
> JUnit @Rule I added in the first patch and set "crunch.tmp.dir" to its
> root directory. Basically, this has to be done with each instantiation
> of MRPipeline (see MRPipelineTest for an example), so perhaps you can
> move something to a global utility method. Another idea would be to
> add a createConfig() method to TemporaryPath that sets "crunch.tmp.dir"
> for you.
>
> I tried assigning the issue to you but my only choices are myself and
> "Automatic".
>
> Thanks,
>    Matthias
>
> On Sunday, 2012-07-22, Rahul wrote:
>> I can fix the issue of setting up crunch.tmp.dir in the integration tests
>>
>> On 22-07-2012 13:08, Matthias Friedrich (JIRA) wrote:
>>>       [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>>
>>> Matthias Friedrich updated CRUNCH-24:
>>> -------------------------------------
>>>
>>>      Attachment: 0001-CRUNCH-24-Clean-up-test-suite-output.patch
>>>
>>> First installment which fixes all test suite output. Next is setting crunch.tmp.dir to a temporary directory that is cleaned automatically, but I can't summon the strength to fix the rest right now. Turnaround times are horrible but there's hope (additional RAM is in the mail) :)


Re: [jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by Matthias Friedrich <ma...@mafr.de>.
Hi Rahul,

that would be really great :)

My idea of solving this was to use the new self-cleaning TemporaryPath
JUnit @Rule I added in the first patch and set "crunch.tmp.dir" to its
root directory. Basically, this has to be done with each instantiation
of MRPipeline (see MRPipelineTest for an example), so perhaps you can
move something to a global utility method. Another idea would be to
add a createConfig() method to TemporaryPath that sets "crunch.tmp.dir"
for you.
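
To make the second idea a bit more concrete, here is a rough, dependency-free sketch. It is not the TemporaryPath class from the patch: ScratchDir is a hypothetical stand-in, java.util.Properties takes the place of a Hadoop Configuration, and AutoCloseable takes the place of the JUnit @Rule lifecycle. Only the property name "crunch.tmp.dir" comes from this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Properties;
import java.util.stream.Stream;

/**
 * Hypothetical stand-in for the TemporaryPath rule discussed above.
 * Uses java.util.Properties where the real code would use a Hadoop Configuration.
 */
public class ScratchDir implements AutoCloseable {

    private final Path root;

    public ScratchDir() throws IOException {
        this.root = Files.createTempDirectory("junit");
    }

    public Path getRoot() {
        return root;
    }

    /** Returns settings with "crunch.tmp.dir" pointing at this scratch directory. */
    public Properties createConfig() {
        Properties conf = new Properties();
        conf.setProperty("crunch.tmp.dir", root.toString());
        return conf;
    }

    /** Deletes the scratch directory and everything inside it. */
    @Override
    public void close() throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        try (ScratchDir tmp = new ScratchDir()) {
            Properties conf = tmp.createConfig();
            System.out.println("crunch.tmp.dir = " + conf.getProperty("crunch.tmp.dir"));
        } // the directory is removed here, even if the body throws
    }
}
```

Having the temporary-directory object hand out the configuration means each test does not have to remember the property name, which is the point of the createConfig() suggestion.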

I tried assigning the issue to you but my only choices are myself and
"Automatic".

Thanks,
  Matthias

On Sunday, 2012-07-22, Rahul wrote:
> I can fix the issue of setting up crunch.tmp.dir in the integration tests
> 
> On 22-07-2012 13:08, Matthias Friedrich (JIRA) wrote:
> >      [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> >Matthias Friedrich updated CRUNCH-24:
> >-------------------------------------
> >
> >     Attachment: 0001-CRUNCH-24-Clean-up-test-suite-output.patch
> >
> >First installment which fixes all test suite output. Next is setting crunch.tmp.dir to a temporary directory that is cleaned automatically, but I can't summon the strength to fix the rest right now. Turnaround times are horrible but there's hope (additional RAM is in the mail) :)

Re: [jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by Rahul <rs...@xebia.com>.
I can fix the issue of setting up crunch.tmp.dir in the integration tests

On 22-07-2012 13:08, Matthias Friedrich (JIRA) wrote:
>       [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Matthias Friedrich updated CRUNCH-24:
> -------------------------------------
>
>      Attachment: 0001-CRUNCH-24-Clean-up-test-suite-output.patch
>
> First installment which fixes all test suite output. Next is setting crunch.tmp.dir to a temporary directory that is cleaned automatically, but I can't summon the strength to fix the rest right now. Turnaround times are horrible but there's hope (additional RAM is in the mail) :)
>                  


[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Friedrich updated CRUNCH-24:
-------------------------------------

    Attachment: 0001-CRUNCH-24-Clean-up-test-suite-output.patch

First installment which fixes all test suite output. Next is setting crunch.tmp.dir to a temporary directory that is cleaned automatically, but I can't summon the strength to fix the rest right now. Turnaround times are horrible but there's hope (additional RAM is in the mail) :)
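One way to check that a test run leaves nothing behind is to scan the temporary directory for the two name patterns mentioned in the issue ("output*" and "crunch*"). The following is a minimal sketch using only the JDK; the class and method names are hypothetical and are not part of Crunch or of the attached patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Crunch: lists entries in a directory
// whose names match the leftover patterns named in this issue.
final class LeftoverCheck {

    static List<Path> findLeftovers(Path dir) throws IOException {
        List<Path> leftovers = new ArrayList<>();
        // The glob matches file names starting with "output" or "crunch".
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(dir, "{output,crunch}*")) {
            for (Path entry : stream) {
                leftovers.add(entry);
            }
        }
        return leftovers;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the tests write to the JVM's default temp directory.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        for (Path p : findLeftovers(tmp)) {
            System.out.println("left behind: " + p);
        }
    }
}
```

On a shared CI machine other jobs may create files with similar names, so a check like this is only meaningful when the directory is private to the build.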
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Gabriel Reid (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriel Reid updated CRUNCH-24:
-------------------------------

    Attachment: CRUNCH-24-gabriel.patch

Altered version of CRUNCH-24-josh.patch that changes the setup of MapsideJoinIT so that it runs on OS X, and undoes the changes made to MapsideJoin.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Sharma updated CRUNCH-24:
-------------------------------

    Attachment: 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch

Modified all integration tests to set crunch.tmp.dir using TemporaryPath.
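The idea of pointing crunch.tmp.dir at a fresh, automatically cleaned directory can be sketched as follows. This is not the TemporaryPath class from the patch and it does not use JUnit or Hadoop: `java.util.Properties` stands in for a Hadoop `Configuration`, and all class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Properties;
import java.util.stream.Stream;

// Hypothetical stand-in for the idea behind TemporaryPath; it is not the
// Crunch class and uses only the JDK.
final class TmpDirConfig {

    // Property name as it appears in this thread.
    static final String CRUNCH_TMP_DIR = "crunch.tmp.dir";

    // Creates a fresh directory and records it under crunch.tmp.dir.
    // Properties stands in for a Hadoop Configuration here.
    static Properties configureWithFreshTmpDir(String prefix) throws IOException {
        Path dir = Files.createTempDirectory(prefix);
        Properties conf = new Properties();
        conf.setProperty(CRUNCH_TMP_DIR, dir.toAbsolutePath().toString());
        return conf;
    }

    // Deletes the directory and everything below it. Reverse path order
    // visits children before their parents.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        }
    }
}
```

In a test, the first method would be called during setup and the second during teardown, so that each test case gets its own directory and removes it afterwards.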
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Sharma updated CRUNCH-24:
-------------------------------

    Attachment: 0001-CRUNCH-24-Distributed-cache.patch

Josh, I have changed MapsideJoin to copy the file to the cache. Can you check whether it works?
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421261#comment-13421261 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

My tests on various Linux machines all worked, too. I don't have OS X available to help with testing, but we have other options like writing to "target/" instead of "/tmp" if this doesn't work out.
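The "target/" alternative could look roughly like the sketch below: each test gets a unique directory under the module's build directory, which `mvn clean` removes. The class name, method name, and the "test-tmp" subdirectory are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Crunch.
final class TargetDirs {

    // Creates a unique directory under <baseDir>/target/test-tmp.
    static Path newTestDir(Path baseDir, String testName) throws IOException {
        Path root = baseDir.resolve("target").resolve("test-tmp");
        Files.createDirectories(root);
        return Files.createTempDirectory(root, testName + "-");
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the working directory is the Maven module directory.
        Path dir = newTestDir(Paths.get("").toAbsolutePath(), "MapsideJoinIT");
        System.out.println("test output goes to " + dir);
    }
}
```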
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422906#comment-13422906 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

Thank you Matthias-- I think I'll commit this one as-is and then create a separate JIRA for the Scrunch fixes.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-Cumulative-1.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421164#comment-13421164 ] 

Rahul Sharma commented on CRUNCH-24:
------------------------------------

Josh, I checked my patch again and it works. I cloned the repository afresh and then applied the required patches; they all worked fine.
I got a similar-looking error while working on CRUNCH-4 (profiles issue) and made a fix in that patch, but this patch did not produce such an error.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Updated] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Sharma updated CRUNCH-24:
-------------------------------

    Attachment: 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch

Updated patch with fixes to the UnionCollectionIT.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420127#comment-13420127 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

Maven-scala-plugin doesn't clean up after itself. I filed a bug report for this (https://github.com/davidB/scala-maven-plugin/issues/96) but I don't think this will be fixed soon.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421973#comment-13421973 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

+1, Gabriel's patch works for me.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Gabriel Reid (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421709#comment-13421709 ] 

Gabriel Reid commented on CRUNCH-24:
------------------------------------

It turns out that MapsideJoinIT fails on OS X because of a combination of two things: DistributedCache is not really supported in local mode, and the default temporary directory of all things Hadoop is "/tmp" (in HDFS). On Linux the default local temporary directory is also "/tmp", so the two happen to coincide, while on OS X it is something else.

I've attached an updated version of CRUNCH-24-josh.patch which undoes the changes to MapsideJoin, and sets the default temporary directory in MapsideJoinIT. This appears to rectify the issue. I've tested it on OS X, but I don't have a Linux machine handy at the moment to test it there; however, I'm confident that it will work there as well.

@Rahul, I took a look at your patch, and I don't think that it will work when running in distributed mode. There is a call to FileSystem#copyFromLocalFile with a path that is on HDFS when running in distributed mode.
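The platform difference described above can be observed with a small probe. This is a sketch that assumes the JVM's `java.io.tmpdir` system property is the local temporary directory in question; the class and method names are hypothetical.

```java
import java.nio.file.Paths;

// Hypothetical probe, not part of Crunch.
final class TmpDirProbe {

    // True when the given directory is "/tmp", ignoring a trailing slash.
    static boolean isSlashTmp(String tmpDir) {
        return Paths.get(tmpDir).normalize().equals(Paths.get("/tmp"));
    }

    public static void main(String[] args) {
        String tmpDir = System.getProperty("java.io.tmpdir");
        System.out.println("java.io.tmpdir = " + tmpDir);
        System.out.println("same as /tmp   = " + isSlashTmp(tmpDir));
    }
}
```

Code that only works when the local temporary directory and Hadoop's "/tmp" coincide would pass on a machine where the probe reports a match and fail where it does not, which is consistent with the Linux and OS X results reported in this thread.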
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421982#comment-13421982 ] 

Rahul Sharma commented on CRUNCH-24:
------------------------------------

Thanks for the clarification, Gabriel. My patch will not work in its current form. I will correct the patch for the CRUNCH-4 issue.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420528#comment-13420528 ] 

Rahul Sharma commented on CRUNCH-24:
------------------------------------

I thought it was also fixed but I see I have missed some things. I will upload an updated patch.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Rahul Sharma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421265#comment-13421265 ] 

Rahul Sharma commented on CRUNCH-24:
------------------------------------

It looks like the file was not copied to the cache but was present on the machine's local file system. When the file was loaded, it could not be found in the job cache but was found at the local OS path. This would work for local runners but would break in a clustered environment. The file should have been copied to the cache in the first place. This is the same issue that was reported when I ran the system with Hadoop 2 on my Linux box.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420847#comment-13420847 ] 

Josh Wills commented on CRUNCH-24:
----------------------------------

Yeah, same error.
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.


        

[jira] [Commented] (CRUNCH-24) Make test suite suitable for continuous integration

Posted by "Matthias Friedrich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-24?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421765#comment-13421765 ] 

Matthias Friedrich commented on CRUNCH-24:
------------------------------------------

Gabriel, I can confirm that it works on Linux, thank you very much!
                
> Make test suite suitable for continuous integration
> ---------------------------------------------------
>
>                 Key: CRUNCH-24
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-24
>             Project: Crunch
>          Issue Type: Task
>    Affects Versions: 0.3.0
>            Reporter: Matthias Friedrich
>            Assignee: Matthias Friedrich
>             Fix For: 0.3.0
>
>         Attachments: 0001-CRUNCH-24-Clean-up-test-suite-output.patch, 0001-CRUNCH-24-Distributed-cache.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, 0001-CRUNCH-24-make-testsuite-sutable-for-CI.patch, CRUNCH-24-gabriel.patch, CRUNCH-24-josh.patch
>
>
> Right now the integration test suite leaves about 80 files behind on /tmp making it unsuitable for a shared continuous integration environment. Examples for these files are test case output ("output*") and Crunch's own temporary files ("crunch*", see CRUNCH-21).
> We have to delete these files or make sure they aren't created on /tmp in the first place.
