Posted to dev@mahout.apache.org by "Dmitriy Lyubimov (JIRA)" <ji...@apache.org> on 2014/01/23 19:32:39 UTC

[jira] [Assigned] (MAHOUT-1408) Distributed cache file matching bug while running SSVD in broadcast mode

     [ https://issues.apache.org/jira/browse/MAHOUT-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Lyubimov reassigned MAHOUT-1408:
----------------------------------------

    Assignee: Dmitriy Lyubimov

> Distributed cache file matching bug while running SSVD in broadcast mode
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1408
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1408
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.8
>            Reporter: Angad Singh
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>         Attachments: BtJob.java.patch
>
>
> The error is:
> java.lang.IllegalArgumentException: Unexpected file name, unable to deduce partition #:file:/data/d1/mapred/local/taskTracker/distcache/434503979705629827_-1822139941_1047712745/nn.red.ua2.inmobi.com/user/rmcuser/oozie-oozi/0034272-140120102756143-oozie-oozi-W/inmobi-ssvd_mahout--java/java-launcher.jar
> 	at org.apache.mahout.math.hadoop.stochasticsvd.SSVDHelper$1.compare(SSVDHelper.java:154)
> 	at org.apache.mahout.math.hadoop.stochasticsvd.SSVDHelper$1.compare(SSVDHelper.java:1)
> 	at java.util.Arrays.mergeSort(Arrays.java:1270)
> 	at java.util.Arrays.mergeSort(Arrays.java:1281)
> 	at java.util.Arrays.mergeSort(Arrays.java:1281)
> 	at java.util.Arrays.sort(Arrays.java:1210)
> 	at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.init(SequenceFileDirValueIterator.java:112)
> 	at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:94)
> 	at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.setup(BtJob.java:220)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:260)
> The bug is at https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/BtJob.java, near line 220,
> and at https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDHelper.java, near line 144.
> SSVDHelper's PARTITION_COMPARATOR assumes that every file in the distributed cache has a name from which a partition number can be deduced. Our distributed cache also contains jar files (such as the java-launcher.jar in the trace above), so the comparator throws the above exception when it is asked to sort them.
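
One way to avoid this kind of failure is to drop cache entries that do not look like partition files before sorting, instead of letting the comparator throw. The following is only a minimal sketch of that idea, not the attached BtJob.java.patch: it uses plain strings rather than Hadoop Path objects, and the class name, method names, and the file-name pattern (task output named like "Bt-m-00000") are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Mahout.
class PartitionFileFilter {

    // ASSUMPTION: partition files are named <prefix>-m-<digits> or
    // <prefix>-r-<digits>, optionally followed by an extension.
    static final Pattern PART_FILE =
        Pattern.compile("(\\w+)-(m|r)-(\\d+)(\\.\\w+)?");

    // Returns the partition number encoded in the file name,
    // or -1 if the name does not match the assumed pattern.
    static int partitionOf(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        Matcher m = PART_FILE.matcher(name);
        return m.matches() ? Integer.parseInt(m.group(3)) : -1;
    }

    // Keeps only paths that look like partition files, ordered by
    // partition number; anything else (e.g. jars) is skipped.
    static List<String> partitionFilesInOrder(List<String> cachePaths) {
        List<String> kept = new ArrayList<>();
        for (String p : cachePaths) {
            if (partitionOf(p) >= 0) {
                kept.add(p);
            }
        }
        kept.sort(Comparator.comparingInt(PartitionFileFilter::partitionOf));
        return kept;
    }

    public static void main(String[] args) {
        // Made-up cache contents: two partition files and one jar.
        List<String> cache = Arrays.asList(
            "/distcache/Bt-m-00002",
            "/distcache/java-launcher.jar",
            "/distcache/Bt-m-00000");
        System.out.println(partitionFilesInOrder(cache));
    }
}
```

Filtering first means the comparator only ever sees names it can parse, so an unrelated jar in the distributed cache is ignored rather than failing the mapper's setup.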



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)