Posted to dev@mahout.apache.org by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2011/09/18 16:36:08 UTC

[jira] [Created] (MAHOUT-814) LocalSSVDSolver tests should use their own tmp space to avoid collisions

LocalSSVDSolver tests should use their own tmp space to avoid collisions
------------------------------------------------------------------------

                 Key: MAHOUT-814
                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
             Project: Mahout
          Issue Type: Bug
            Reporter: Grant Ingersoll
            Priority: Minor


I am running Mahout in an environment where Jenkins is also running, and I am getting:
{quote}
java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
        at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
        at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
        at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
        at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
        at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
        at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
        at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
        at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
{quote}

Also seeing the following tests fail:
{quote}

Tests in error: 
  testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
  testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
  testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
  testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
{quote}

I haven't checked all of them, but I suspect they are all due to the same reason.  We should dynamically create a temp area for each test using temporary directories under the main temp dir.
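
As a rough illustration of the "temp area for each test" idea, here is a minimal sketch. The class name PerTestTempDir and the "mahout-" prefix are made up for this example and are not part of Mahout; it only assumes a Java version that provides java.nio.file. Each call creates a new, randomly suffixed directory under the main temp dir, so a fixed file name such as q-temp.seq no longer lands directly in /tmp.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Mahout): one private scratch directory per test. */
final class PerTestTempDir {

  private PerTestTempDir() {}

  /** Creates a fresh, uniquely named directory under the JVM's main temp dir. */
  static Path create(String testName) {
    Path base = Paths.get(System.getProperty("java.io.tmpdir"));
    try {
      // createTempDirectory appends a random suffix to the prefix, so two runs
      // (for example a developer and a Jenkins job) do not get the same path.
      return Files.createTempDirectory(base, "mahout-" + testName + "-");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path first = create("LocalSSVDSolverDenseTest");
    Path second = create("LocalSSVDSolverDenseTest");
    // The fixed file name now resolves inside two different directories.
    System.out.println(first.resolve("q-temp.seq"));
    System.out.println(second.resolve("q-temp.seq"));
    // Clean up the (still empty) directories.
    first.toFile().delete();
    second.toFile().delete();
  }
}
```

A test would typically create the directory in its setup method and delete it again in teardown.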

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAHOUT-814) QRFirstStep should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107529#comment-13107529 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

They used to use their own space. But I think Sebastian did a fix to use some methods from the abstract Mahout test class, and now they land somewhere else. I can take a look, provided I know what the best practice really is for using temp directories for tests that run local MR jobs.


        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108858#comment-13108858 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

Actually, from what I am reading, even java.io.tmpdir may be shared between different tasks, even in distributed mode, when JVM sharing is enabled (I forget the property name, but I also use it, at a value of ~5-10, to speed up the setup of small tasks).

In which case using java.io.tmpdir/q-temp may potentially be a problem as well.

+What I propose: it would probably be safer to use a directory *$java.io.tmpdir/$taskAttemptID*, which would guarantee per-task uniqueness (though perhaps not with local tests; if it is not unique with local tests, I will add some random numbers).+
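
To make the proposal above concrete, here is a minimal sketch of a per-attempt scratch directory under the JVM temp dir, with an optional random suffix for the local-test case. The class TaskScratchDir and its method are hypothetical and this is not the code in the attached patch. The attempt ID is passed in as a plain string so the sketch runs without Hadoop; in a real mapper it would presumably come from something like context.getTaskAttemptID().toString().

```java
import java.io.File;
import java.util.UUID;

/** Hypothetical helper (not the actual MAHOUT-814 patch): per-attempt scratch space. */
final class TaskScratchDir {

  private TaskScratchDir() {}

  /**
   * Returns (creating it if needed) a directory under java.io.tmpdir named after
   * the task attempt. When addRandomSuffix is true, a random UUID is appended,
   * for cases such as local tests where the attempt ID alone may not be unique.
   */
  static File forAttempt(String taskAttemptId, boolean addRandomSuffix) {
    File base = new File(System.getProperty("java.io.tmpdir"));
    String name = addRandomSuffix
        ? taskAttemptId + "-" + UUID.randomUUID()
        : taskAttemptId;
    File dir = new File(base, name);
    // mkdirs() returns false if the directory already exists, so check both.
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IllegalStateException("Cannot create scratch dir " + dir);
    }
    return dir;
  }

  public static void main(String[] args) {
    // The attempt ID below is an invented placeholder.
    File dir = forAttempt("attempt_example_0001_m_000000_0", true);
    System.out.println(new File(dir, "q-temp.seq"));
    dir.delete();
  }
}
```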

{panel}
A little background: having to write a temporary file in this task is a corner case that arises only when the Q block height is smaller than the number of input rows of A coming in. That should never be the case with normal block sizes, but it may be the case with minSplitSize splits set at 1G or so, or if the A input is extremely sparse (such as one non-zero element per row on average). With such input the Q blocks are still k+p wide (the number of eigenvalues requested), which is not a very good use case for this method; I would rather try to transpose first to see if it helps row-wise sparsity.

The test, however, is intentionally set up with an extremely small Q block height, to test blocking both within a split and among the splits.
{panel}

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch

        

[jira] [Updated] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Sean Owen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-814:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
    
> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814-1.patch, MAHOUT-814.patch

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113130#comment-13113130 ] 

Hudson commented on MAHOUT-814:
-------------------------------

Integrated in Mahout-Quality #1057 (See [https://builds.apache.org/job/Mahout-Quality/1057/])
    MAHOUT-814 (patch rev.1)

dlyubimov : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1174452
Files : 
* /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/qr/QRFirstStep.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDSolverDenseTest.java



        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107534#comment-13107534 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

Oh, I think this particular thing is different. There's a semi-official rule in MapReduce for how to access a task's temporary space: MapReduce boxes each task in such a way that its temporary space can be accessed through a system variable.

Apparently, local MR spends much less effort on boxing a task properly. We can probably figure out a workaround for the local-mode task boxing.



        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109767#comment-13109767 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

Nathan, your issue is now resolved in MAHOUT-816 (pending commit)


        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108328#comment-13108328 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

There's also some info floating around that the job.local.dir property may be a better candidate for a task's scratch space than java.io.tmpdir.
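
A minimal sketch of that idea, assuming the job configuration can be queried by key: prefer job.local.dir when it is set, and fall back to the JVM temp dir otherwise. The class ScratchBase is hypothetical, and java.util.Properties stands in for the Hadoop job configuration so that the example runs on its own; whether job.local.dir is actually populated in local mode is not something this sketch verifies.

```java
import java.util.Properties;

/** Hypothetical helper: picks the base directory for a task's scratch files. */
final class ScratchBase {

  private ScratchBase() {}

  /**
   * Prefers the job.local.dir setting when present and non-empty; otherwise
   * falls back to the JVM-wide java.io.tmpdir system property.
   * The Properties argument is a stand-in for the real job configuration.
   */
  static String resolve(Properties jobConf) {
    String dir = jobConf.getProperty("job.local.dir");
    if (dir == null || dir.isEmpty()) {
      dir = System.getProperty("java.io.tmpdir");
    }
    return dir;
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(resolve(conf)); // falls back to the JVM temp dir
    conf.setProperty("job.local.dir", "/example/job/scratch"); // invented path
    System.out.println(resolve(conf)); // now uses the configured value
  }
}
```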




        

[jira] [Updated] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov updated MAHOUT-814:
------------------------------------

    Summary: SSVD local tests should use their own tmp space to avoid collisions  (was: QRFirstStep should use their own tmp space to avoid collisions)

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Priority: Minor
>

        

[jira] [Updated] (MAHOUT-814) QRFirstStep should use their own tmp space to avoid collisions

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated MAHOUT-814:
-----------------------------------

    Summary: QRFirstStep should use their own tmp space to avoid collisions  (was: LocalSSDSolver tests should use their own tmp space to avoid collisions)

> QRFirstStep should use their own tmp space to avoid collisions
> --------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Priority: Minor
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113020#comment-13113020 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

OK, I take it there are no objections, so I will commit it. It is a simple patch, and that's what Grant wanted.

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814-1.patch, MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109725#comment-13109725 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

No, it isn't related. Good catch. q=1 made such a good showing that I never tried q>1, not even in local tests. That should be a typo-level fix.

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>

        

[jira] [Updated] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov updated MAHOUT-814:
------------------------------------

    Attachment: MAHOUT-814.patch

Here is one hack that can probably overcome this. I am still not convinced that's the best way to do it, but at least the task would no longer write directly into java.io.tmpdir when specifically requested not to.
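
The general idea of keeping a task out of the shared java.io.tmpdir root can be sketched with the JDK alone. This is not the attached patch, only an illustration under assumptions: the class name, the helper newScratchDir and the "ssvd-" prefix are hypothetical, and Files.createTempDirectory requires Java 7 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PerTaskTempDir {

    // Hypothetical helper: creates a private scratch directory under
    // java.io.tmpdir. The JDK appends a random component to the prefix,
    // so two callers never receive the same directory.
    static Path newScratchDir(String taskId) throws IOException {
        return Files.createTempDirectory("ssvd-" + taskId + "-");
    }

    public static void main(String[] args) throws IOException {
        Path dir = newScratchDir("task0");

        // The fixed file name is now safe, because the parent directory
        // belongs to this task only.
        Path qTemp = dir.resolve("q-temp.seq");
        Files.createFile(qTemp);
        System.out.println(Files.exists(qTemp));

        // Clean up so repeated runs do not accumulate scratch data.
        Files.delete(qTemp);
        Files.delete(dir);
    }
}
```

With a private parent directory, the file name q-temp.seq itself can stay fixed, since another user's leftover file in /tmp is no longer in the way.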



> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>         Attachments: MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Nathan Halko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109689#comment-13109689 ] 

Nathan Halko commented on MAHOUT-814:
-------------------------------------

Not sure if this is related, but it sounds similar.  I can't run more than one power iteration, i.e. q=2 produces:

11/09/21 11:25:46 INFO mapred.LocalJobRunner: reduce > reduce
11/09/21 11:25:46 INFO mapred.Task: Task 'attempt_local_0004_r_000000_0' done.
11/09/21 11:25:50 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-nathanhalko/mapred/staging/nathanhalko-200181280/.staging/job_local_0005
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory temp/ABt-job-1 already exists
	at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:134)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:830)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:791)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
	at org.apache.mahout.math.hadoop.stochasticsvd.ABtJob.run(ABtJob.java:454)
	at org.apache.mahout.math.hadoop.stochasticsvd.SSVDSolver.run(SSVDSolver.java:312)
	at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.run(SSVDCli.java:118)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.mahout.math.hadoop.stochasticsvd.SSVDCli.main(SSVDCli.java:163)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

For q=0 and q=1 everything works fine.  I am running with --overwrite and I rm -rf the temp dir before running.

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107567#comment-13107567 ] 

Grant Ingersoll commented on MAHOUT-814:
----------------------------------------

Yeah, it is different.  It's not actually in the test; it just manifested itself for me via the test.

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Priority: Minor
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108392#comment-13108392 ] 

Sean Owen commented on MAHOUT-814:
----------------------------------

Your analysis is completely correct. java.io.tmpdir is more than that -- it's the definition of "writable temp space" for Java itself. There should be nothing wrong with using this; in fact, it's the only reliable temp space available.

1. If it's not writable, that's a problem with the host machine. As you might have noticed, Jenkins seems to be quite flaky; a few times a week it will fail for some internal machine-specific reason.

2. However, I too am not sure that's the issue; it could still be a clash somehow. Can you change the job that uses q-temp.seq to stash it in a file whose name includes a timestamp, like q-temp-125095095090.seq? In general I think this is a good strategy, and it is why the test framework does exactly this.

It's either nothing for us to fix (1), or, I think, a case of making temp files more unique (2). You shouldn't have to do anything more complex.
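
As a rough illustration of option (2), the sketch below shows two ways to give such a temp file a more unique name using only the JDK: embedding a timestamp in the name, and letting File.createTempFile pick the name. The class and method names (UniqueTempFiles, timestamped, unique) are invented for this example; they are not part of Mahout or of the attached patch.

```java
import java.io.File;
import java.io.IOException;

// Illustrative only: hypothetical helper, not Mahout code.
final class UniqueTempFiles {

  // Builds a name such as q-temp-125095095090.seq inside dir.
  // Nothing is created on disk; two callers in the same millisecond
  // would still get the same name.
  static File timestamped(File dir, String prefix, String suffix, long timestampMillis) {
    return new File(dir, prefix + '-' + timestampMillis + suffix);
  }

  // Creates a new, empty file whose name the JDK guarantees did not exist before.
  static File unique(File dir, String prefix, String suffix) throws IOException {
    File file = File.createTempFile(prefix + '-', suffix, dir);
    file.deleteOnExit();
    return file;
  }

  public static void main(String[] args) throws IOException {
    File tmp = new File(System.getProperty("java.io.tmpdir"));
    System.out.println(timestamped(tmp, "q-temp", ".seq", System.currentTimeMillis()));
    System.out.println(unique(tmp, "q-temp", ".seq"));
  }
}
```

The timestamp variant matches the naming suggested above, while createTempFile also covers the case of two tasks starting within the same millisecond.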

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>
> Running Mahout in an environment with Jenkins also running and am getting:
> {quote}
> java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
>         at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
>         at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> {quote}
> Also seeing the following tests fail:
> {quote}
> Tests in error: 
>   testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
>   testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
> {quote}
> I haven't checked all of them, but I suspect they are all due to the same reason.  We should dynamically create a temp area for each test using temporary directories under the main temp dir.


        

[jira] [Issue Comment Edited] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108319#comment-13108319 ] 

Dmitriy Lyubimov edited comment on MAHOUT-814 at 9/20/11 1:51 AM:
------------------------------------------------------------------

Here is one hack that can probably overcome this. I am still not convinced that's the best way to do it, but at least the task wouldn't be writing directly into java.io.tmpdir when specifically requested not to.

Github snapshot is here: https://github.com/dlyubimov/mahout-commits/tree/MAHOUT-814




  

        

[jira] [Issue Comment Edited] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108858#comment-13108858 ] 

Dmitriy Lyubimov edited comment on MAHOUT-814 at 9/20/11 6:03 PM:
------------------------------------------------------------------

Actually, from what I am reading, java.io.tmpdir may be shared between different tasks even in distributed mode if JVM sharing is enabled (I forget the property name, but I use it with a value of ~5-10 to speed up the setup of small tasks).

In that case, using java.io.tmpdir/q-temp may potentially be a problem as well.

+What I propose: it would probably be safer to use a directory *$java.io.tmpdir/$taskAttemptID*, which would guarantee uniqueness per task (but perhaps not in local tests; if it is not unique there, I will add some random numbers).+

{panel}
A little background: having to write a temporary file in this task is a corner case. It arises only when the Q block height setting is smaller than the number of input rows of A coming in within one split. That should never be the case with normal block sizes, but it may happen with minSplitSize splits set at 1G or something, or if the A input is extremely sparse (such as one non-zero element per row on average). In the latter case Q blocks, which are k+p wide (k being the number of eigenvalues requested and p the oversampling), could be rather memory intensive; that is not a very good use case for this method, and I'd rather try to transpose first to see if it helps row-wise sparsity.

Only in those corner cases is the out-of-core version of the task's local computation of what I call Q-hat blocks used. That pass is always sequential, though, so it should capitalize on sequential read speeds and potentially the OS I/O cache if there's enough memory installed.

The test, however, is intentionally set up with an extremely small Q block height in order to test blocking both within a split and among the splits.
{panel}
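
A minimal sketch of the per-task directory idea from this comment, assuming the task attempt ID is available as a plain string (in a real job it would presumably come from the task context, which is not shown here). The class name TaskScratchDir and its method are hypothetical and this is not necessarily what the attached patch does; it only illustrates "directory named after the attempt, with a random suffix if that name is already taken".

```java
import java.io.File;
import java.io.IOException;
import java.util.UUID;

// Illustrative only: hypothetical helper, not Mahout or Hadoop code.
final class TaskScratchDir {

  // Creates baseTmp/taskAttemptId. If that directory cannot be created
  // (for example because local tests reuse the same attempt ID and it
  // already exists), falls back to a name with a random suffix.
  static File create(File baseTmp, String taskAttemptId) throws IOException {
    File dir = new File(baseTmp, taskAttemptId);
    if (dir.mkdirs()) {
      return dir; // mkdirs() returns true only if this call created it
    }
    File fallback = new File(baseTmp, taskAttemptId + '-' + UUID.randomUUID());
    if (fallback.mkdirs()) {
      return fallback;
    }
    throw new IOException("Could not create scratch directory under " + baseTmp);
  }

  public static void main(String[] args) throws IOException {
    File baseTmp = new File(System.getProperty("java.io.tmpdir"));
    // The attempt ID below is a made-up example value.
    File scratch = create(baseTmp, "attempt_local_0001_m_000000_0");
    System.out.println(new File(scratch, "q-temp.seq"));
  }
}
```

Relying on the return value of mkdirs() rather than checking exists() first avoids a window in which two tasks both see the directory as missing and then share it.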

  

        

[jira] [Issue Comment Edited] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108328#comment-13108328 ] 

Dmitriy Lyubimov edited comment on MAHOUT-814 at 9/20/11 2:37 AM:
------------------------------------------------------------------

Another workaround, per the tutorial, seems to be to construct the task's path per the rule below, but does it have to be that complicated just to access the task scratchpad path? Besides, I looked at the history of this and it really is subject to change. So using java.io.tmpdir really seems like the safest bet.

{quote}

${mapred.local.dir}/taskTracker/jobcache/$jobid/$taskid/work/tmp : The temporary directory for the task. (User can specify the property mapred.child.tmp to set the value of temporary directory for map and reduce tasks. This defaults to ./tmp. If the value is not an absolute path, it is prepended with task's working directory. Otherwise, it is directly assigned. The directory will be created if it doesn't exist. Then, the child java tasks are executed with option -Djava.io.tmpdir='the absolute path of the tmp dir'. And pipes and streaming are set with environment variable, TMPDIR='the absolute path of the tmp dir'). This directory is created, if mapred.child.tmp has the value ./tmp


{quote}
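
The resolution rule in the quoted passage (a relative mapred.child.tmp value is prepended with the task's working directory, an absolute one is used as is) can be sketched in a few lines. This only mirrors the rule as quoted; it is not Hadoop's actual implementation, and the class name ChildTmpResolver and the example paths are made up.

```java
import java.io.File;

// Illustrative only: mirrors the quoted rule, not Hadoop's own code.
final class ChildTmpResolver {

  // childTmp plays the role of the mapred.child.tmp value ("./tmp" by default).
  static File resolve(File taskWorkDir, String childTmp) {
    File tmp = new File(childTmp);
    // Relative values are prepended with the task's working directory;
    // absolute values are assigned directly.
    File resolved = tmp.isAbsolute() ? tmp : new File(taskWorkDir, childTmp);
    // normalize() drops redundant "." segments such as in "work/./tmp".
    return resolved.toPath().normalize().toFile();
  }

  public static void main(String[] args) {
    // Hypothetical working directory, for demonstration only.
    File workDir = new File(System.getProperty("java.io.tmpdir"), "work");
    System.out.println(resolve(workDir, "./tmp"));
  }
}
```

Under this rule the task would then see the resolved path as its java.io.tmpdir, which is why writing to java.io.tmpdir inside a task already lands in task-specific space in distributed mode.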

      was (Author: dlyubimov):
    There's also some info floating around that the job.local.dir property may be a better candidate for a task's scratch space than java.io.tmpdir.


  

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108875#comment-13108875 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

Or the timestamp approach may work too, of course.


        

[jira] [Issue Comment Edited] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108858#comment-13108858 ] 

Dmitriy Lyubimov edited comment on MAHOUT-814 at 9/20/11 5:58 PM:
------------------------------------------------------------------

Actually, from what I am reading, java.io.tmpdir may be shared between different tasks even in distributed mode if JVM sharing is enabled (I forget the property name, but I use it with a value of ~5-10 to speed up the setup of small tasks).

In that case, using java.io.tmpdir/q-temp may potentially be a problem as well.

+What I propose: it would probably be safer to use a directory *$java.io.tmpdir/$taskAttemptID*, which would guarantee uniqueness per task (but perhaps not in local tests; if it is not unique there, I will add some random numbers).+

{panel}
A little background: having to write a temporary file in this task is a corner case. It arises only when the Q block height is smaller than the number of input rows of A coming in. That should never be the case with normal block sizes, but it may happen with minSplitSize splits set at 1G or something, or if the A input is extremely sparse (such as one non-zero element per row on average). In the latter case Q blocks, which are k+p wide (k being the number of eigenvalues requested and p the oversampling), could be rather memory intensive; that is not a very good use case for this method, and I'd rather try to transpose first to see if it helps row-wise sparsity.

Only in those corner cases is the out-of-core version of the task's local computation of what I call Q-hat blocks used. That pass is always sequential, though, so it should capitalize on sequential read speeds and potentially the OS I/O cache if there's enough memory installed.

The test, however, is intentionally set up with an extremely small Q block height in order to test blocking both within a split and among the splits.
{panel}

  

        

[jira] [Assigned] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov reassigned MAHOUT-814:
---------------------------------------

    Assignee: Dmitriy Lyubimov

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>
> Running Mahout in an environment with Jenkins also running and am getting:
> {quote}
> java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
>         at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
>         at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> {quote}
> Also seeing the following tests fail:
> {quote}
> Tests in error: 
>   testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
>   testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
> {quote}
> I haven't checked all of them, but I suspect they are all due to the same reason.  We should dynamically create a temp area for each test using temporary directories under the main temp dir.


        

[jira] [Updated] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov updated MAHOUT-814:
------------------------------------

    Attachment: MAHOUT-814-1.patch

Implementation per Grant's suggestion. 

BTW, part of the suggestion (cleaning out task temp files) was already implemented earlier; it happens in cleanup() of the mapper.

I am still rather dubious about the cause of the exception. It looks like a permission issue on java.io.tmpdir to me.



> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814-1.patch, MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108303#comment-13108303 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

OK, I actually don't know what the best way to fix this is.

Hadoop sandboxes tasks so that java.io.tmpdir points to the task's temporary directory in the mapred folder (in distributed mode). That's basically their contract for task temporary space.

My idea was to hack that in the case of local mapred so that it points to something more suitable for Mahout.

But that "more suitable" location turns out to be... /tmp + something, as in the following code:
{code:title=MahoutTestCase.java}

  protected final File getTestTempDir() throws IOException {
    if (testTempDir == null) {
      String systemTmpDir = System.getProperty("java.io.tmpdir");
      long simpleRandomLong = (long) (Long.MAX_VALUE * Math.random());
      testTempDir = new File(systemTmpDir, "mahout-" + getClass().getSimpleName() + '-' + simpleRandomLong);
      if (!testTempDir.mkdir()) {
        throw new IOException("Could not create " + testTempDir);
      }
      testTempDir.deleteOnExit();
    }
    return testTempDir;
  }

{code}

So... it looks like Mahout's test framework already relies on that property (which in turn is left unregulated by Mahout, so in test mode it probably points to /tmp already).

So it looks like I cannot override java.io.tmpdir Mahout-wide, because Mahout already attaches some meaning to that property.

I don't immediately see the best solution here.

I can probably change the solvers so that they don't necessarily write to the task's root folder but create another folder there. That still doesn't guarantee the absence of clashes during tests, though (only getTestTempDir() would guarantee that).

So I would like to solicit some discussion here.
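
For discussion, one possible shape for a collision-free temp area is sketched below. It is only an illustration, not the attached patch: the names UniqueTempDir and createUnder are hypothetical, and it assumes Java 7 or later for java.nio.file.Files.createTempDirectory, which picks a fresh name and creates the directory in one step instead of relying on a random long.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Mahout: creates a fresh directory under
// java.io.tmpdir so that concurrent runs never share a file such as q-temp.seq.
final class UniqueTempDir {

  private UniqueTempDir() {
  }

  // The prefix is only for readability; createTempDirectory appends a unique suffix.
  static Path createUnder(String prefix) throws IOException {
    Path base = Paths.get(System.getProperty("java.io.tmpdir"));
    return Files.createTempDirectory(base, prefix);
  }

  public static void main(String[] args) throws IOException {
    Path first = createUnder("mahout-ssvd-");
    Path second = createUnder("mahout-ssvd-");
    // Each call yields its own directory, so the two paths differ.
    System.out.println(first);
    System.out.println(second);
    Files.delete(first);
    Files.delete(second);
  }
}
```

Note that this still resolves against java.io.tmpdir, so it does not help if that directory itself is not writable by the process.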


> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>

        

[jira] [Issue Comment Edited] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108858#comment-13108858 ] 

Dmitriy Lyubimov edited comment on MAHOUT-814 at 9/20/11 5:51 PM:
------------------------------------------------------------------

Actually, from what I am reading, java.io.tmpdir may be shared between different tasks even in distributed mode if JVM sharing is enabled (I forget the property name, but I also use it with a value of ~5-10 to speed up the setup of small tasks).

In that case, using $java.io.tmpdir/q-temp may be a problem as well.

+What I propose: it would probably be safer to use a directory *$java.io.tmpdir/$taskAttemptID*, which would guarantee per-task uniqueness (but perhaps not with local tests; if it is not unique there, I will add some random numbers).+
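
The proposal above could look roughly like the following sketch. It is an illustration under assumptions, not the attached patch: TaskTempDir and forAttempt are hypothetical names, the attempt ID is passed in as a plain string (in a mapper it would presumably come from context.getTaskAttemptID().toString()), and the ID used in main is a made-up example value. A random suffix is used only when the plain directory cannot be created, which covers the local-test case where attempt IDs may repeat.

```java
import java.io.File;
import java.io.IOException;
import java.util.UUID;

// Hypothetical helper, not part of Mahout or Hadoop.
final class TaskTempDir {

  private TaskTempDir() {
  }

  static File forAttempt(File tmpRoot, String taskAttemptId) throws IOException {
    File dir = new File(tmpRoot, taskAttemptId);
    // mkdir() returns false when the directory could not be created,
    // for instance because it already exists (the ID was not unique).
    if (dir.mkdir()) {
      return dir;
    }
    // Fall back to the same name plus a random suffix.
    File fallback = new File(tmpRoot, taskAttemptId + '-' + UUID.randomUUID());
    if (!fallback.mkdir()) {
      throw new IOException("Could not create " + fallback);
    }
    return fallback;
  }

  public static void main(String[] args) throws IOException {
    File root = new File(System.getProperty("java.io.tmpdir"));
    // Same (made-up) attempt ID twice: the second call gets a suffixed directory.
    File a = forAttempt(root, "attempt_local_0001_m_000000_0");
    File b = forAttempt(root, "attempt_local_0001_m_000000_0");
    System.out.println(a);
    System.out.println(b);
    b.delete();
    a.delete();
  }
}
```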

{panel}
A little background: having to write a temporary file in this task is a corner case that only arises when the Q block height is smaller than the number of input rows of A coming in. That should never be the case with normal block sizes, but it may happen with minSplitSize splits set at 1G or so, or if the A input is extremely sparse (such as one non-zero element per row on average). In that case, yeah, Q blocks, which are k+p wide (k being the number of singular values requested and p the oversampling), could be rather memory intensive. That is not a very good use case for this method; I'd rather try to transpose first to see if it helps row-wise sparsity.

The test, however, intentionally sets the Q block height extremely small in order to test blocking both within a split and across splits.
{panel}

      was (Author: dlyubimov):
    Actually, from what i am reading, even java.io.tmpdir may be shared between different tasks even in distributed mode in case jvm sharing is enabled (forget the property name, but i use it also at value of ~5-10 to speed up small tasks setup). 

In which case using java.io.tmp/q-temp may potentially be a problem as well. 

+What i propose is it probably would be safer to use a directory *$java.io.tmp/$taskAttemptID* which would guarantee task uniqueness (but perhaps not with local tests, if it is not unique with local tests, i will add some random numbers).+

{panel}
A little background: having to write a temporary file in this task is a corner case only arising when Q block height is smaller than the number of input rows of A coming in, which should never be the case with normal block sizes but may be a case with minSplitSize splits set at 1G or something, or if A input is extremely sparse (such as one non-zero element per row on average, then yeah, Q blocks, which are k+p wide (the number of eigenvalues requested), which is not a very good use case for this method, i'd rather try to transpose first to see if it helps row-wise sparsity). 

The test however is set up intentionally the way that Q block height is set extremely small to test both blocking within a split and among the splits.
{panel}
  
> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108598#comment-13108598 ] 

Grant Ingersoll commented on MAHOUT-814:
----------------------------------------

+1 on Sean's option #2.  The only question then is do we want to make it a little more forceful in the cleanup of temp files?  Perhaps a deleteOnExit?  Not sure what that would mean in distributed mode (I would assume it's safe, but what happens if you are reusing the JVM?)
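
On the deleteOnExit question, one point worth keeping in mind: File.deleteOnExit() only acts at normal JVM termination and cannot remove a directory that still contains files, so with a reused JVM the temp files would linger until that JVM finally exits. A sketch of explicit cleanup instead is shown below. TempCleanup and deleteRecursively are hypothetical names, not Mahout code, and the sketch assumes the temp tree contains no symbolic links.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of Mahout.
final class TempCleanup {

  private TempCleanup() {
  }

  // Deletes children first, because File.delete() refuses non-empty directories.
  // Assumes the tree contains no symbolic links.
  static void deleteRecursively(File f) throws IOException {
    File[] children = f.listFiles(); // null when f is not a directory
    if (children != null) {
      for (File child : children) {
        deleteRecursively(child);
      }
    }
    if (f.exists() && !f.delete()) {
      throw new IOException("Could not delete " + f);
    }
  }

  public static void main(String[] args) throws IOException {
    File dir = new File(System.getProperty("java.io.tmpdir"), "cleanup-demo-" + System.nanoTime());
    if (!dir.mkdir()) {
      throw new IOException("Could not create " + dir);
    }
    try {
      // Stand-in for the task's temporary output.
      new File(dir, "q-temp.seq").createNewFile();
    } finally {
      // Runs even if the task body throws, independent of JVM exit.
      deleteRecursively(dir);
    }
    System.out.println("still exists: " + dir.exists());
  }
}
```

Calling something like this from the task's cleanup step does not depend on when the JVM exits, which sidesteps the JVM-reuse question.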

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108304#comment-13108304 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

Also, I am not sure I am construing the exception correctly, but the original stack trace doesn't look like a clash. Rather, it looks like java.io.tmpdir is not writable by the process in that Jenkins environment, which would be a problem regardless of what we do.
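
A quick check on the build machine could help tell the two cases apart. The sketch below is hypothetical (TmpDirCheck and diagnose are not part of Mahout, and Java 7 or later is assumed for java.nio.file). It reports whether the temp directory is writable by the current process, or whether a leftover file of the given name exists there but cannot be written, for example because another user created it.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of Mahout.
final class TmpDirCheck {

  private TmpDirCheck() {
  }

  static String diagnose(Path tmpDir, String fileName) {
    if (!Files.isWritable(tmpDir)) {
      return "tmp dir is not writable by this process";
    }
    Path file = tmpDir.resolve(fileName);
    if (Files.exists(file) && !Files.isWritable(file)) {
      return "file exists but is not writable (possibly left by another user)";
    }
    return "ok";
  }

  public static void main(String[] args) {
    Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
    // q-temp.seq is the file name from the stack trace in this issue.
    System.out.println(tmp + ": " + diagnose(tmp, "q-temp.seq"));
  }
}
```

The first result would point to an environment problem, the second to a genuine collision on a shared file name.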

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>

        

[jira] [Commented] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107609#comment-13107609 ] 

Dmitriy Lyubimov commented on MAHOUT-814:
-----------------------------------------

It will actually only manifest in tests under Jenkins, for some reason. There's actually nothing to be fixed (except maybe the local mode of MapReduce itself), and it will work exactly as it is supposed to in distributed mode.

But we can probably squeeze a workaround into the Mahout abstract test class. Just another piece of black magic to plug a hole in local MapReduce mode.

I can take a look at it next week, but I am not sure how to verify it with Jenkins other than committing it first.

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Grant Ingersoll
>            Priority: Minor
>

        

[jira] [Updated] (MAHOUT-814) SSVD local tests should use their own tmp space to avoid collisions

Posted by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov updated MAHOUT-814:
------------------------------------

        Fix Version/s: 0.6
    Affects Version/s: 0.5
               Status: Patch Available  (was: Open)

> SSVD local tests should use their own tmp space to avoid collisions
> -------------------------------------------------------------------
>
>                 Key: MAHOUT-814
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-814
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Grant Ingersoll
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-814.patch
>
>
> I am running Mahout in an environment where Jenkins is also running, and I am getting:
> {quote}
> java.io.FileNotFoundException: /tmp/q-temp.seq (Permission denied)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:187)
>         at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:183)
>         at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:241)
>         at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:335)
>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:528)
>         at org.apache.hadoop.io.SequenceFile$BlockCompressWriter.<init>(SequenceFile.java:1198)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:401)
>         at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.getTempQw(QRFirstStep.java:263)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.flushSolver(QRFirstStep.java:104)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.map(QRFirstStep.java:175)
>         at org.apache.mahout.math.hadoop.stochasticsvd.qr.QRFirstStep.collect(QRFirstStep.java:279)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:142)
>         at org.apache.mahout.math.hadoop.stochasticsvd.QJob$QMapper.map(QJob.java:71)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> {quote}
> Also seeing the following tests fail:
> {quote}
> Tests in error: 
>   testSSVDSolverSparse(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverSparseSequentialTest): Q job unsuccessful.
>   testSSVDSolverPowerIterations1(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
>   testSSVDSolverDense(org.apache.mahout.math.hadoop.stochasticsvd.LocalSSVDSolverDenseTest): Q job unsuccessful.
> {quote}
> I haven't checked all of them, but I suspect they all fail for the same reason. We should dynamically create a separate temp area for each test, using temporary directories under the main temp dir.
