Posted to user@mahout.apache.org by Sameer Tilak <ss...@live.com> on 2013/12/27 22:14:00 UTC

itemsimilarity Exception in thread: java.io.FileNotFoundException: File does not exist: numUsers.bin

Hi All,
I am having another issue with item similarity. For some reason the numUsers.bin file does not get generated. Here is the command I am running:
./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput --tempDir /scratch/Similartytemp -s SIMILARITY_COOCCURRENCE --maxSimilaritiesPerItem 10
The first MR job runs, and at the end of it I see the following error:
13/12/27 12:56:57 INFO mapred.JobClient:  map 84% reduce 22%
13/12/27 12:57:00 INFO mapred.JobClient:  map 86% reduce 22%
13/12/27 12:57:05 INFO mapred.JobClient: Job complete: job_201311111627_0438
13/12/27 12:57:05 INFO mapred.JobClient: Counters: 24
13/12/27 12:57:05 INFO mapred.JobClient:   Job Counters
13/12/27 12:57:05 INFO mapred.JobClient:     Launched reduce tasks=1
13/12/27 12:57:05 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=314781
13/12/27 12:57:05 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/27 12:57:05 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/27 12:57:05 INFO mapred.JobClient:     Rack-local map tasks=12
13/12/27 12:57:05 INFO mapred.JobClient:     Launched map tasks=61
13/12/27 12:57:05 INFO mapred.JobClient:     Data-local map tasks=49
13/12/27 12:57:05 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=27061
13/12/27 12:57:05 INFO mapred.JobClient:     Failed map tasks=1
13/12/27 12:57:05 INFO mapred.JobClient:   FileSystemCounters
13/12/27 12:57:05 INFO mapred.JobClient:     HDFS_BYTES_READ=19279584
13/12/27 12:57:05 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1310480
13/12/27 12:57:05 INFO mapred.JobClient:   File Input Format Counters
13/12/27 12:57:05 INFO mapred.JobClient:     Bytes Read=19272534
13/12/27 12:57:05 INFO mapred.JobClient:   Map-Reduce Framework
13/12/27 12:57:05 INFO mapred.JobClient:     Map output materialized bytes=189690
13/12/27 12:57:05 INFO mapred.JobClient:     Combine output records=43081
13/12/27 12:57:05 INFO mapred.JobClient:     Map input records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     Physical memory (bytes) snapshot=17124999168
13/12/27 12:57:05 INFO mapred.JobClient:     Spilled Records=43081
13/12/27 12:57:05 INFO mapred.JobClient:     Map output bytes=7755258
13/12/27 12:57:05 INFO mapred.JobClient:     CPU time spent (ms)=37540
13/12/27 12:57:05 INFO mapred.JobClient:     Total committed heap usage (bytes)=17996169216
13/12/27 12:57:05 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=129811210240
13/12/27 12:57:05 INFO mapred.JobClient:     Combine input records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     Map output records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     SPLIT_RAW_BYTES=7050
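For reference, the counters in this log include a "Failed map tasks" entry. Below is a minimal sketch for pulling such counters out of a saved copy of the console output, assuming the output was redirected to a file. The file name job.log and the function name failed_task_counters are hypothetical and are not part of Mahout or Hadoop.

```shell
#!/bin/sh
# Print the "Failed ... tasks" counters found in a saved JobClient console log.
# Assumption: the log has one counter per line, as JobClient normally prints it.
# Usage: failed_task_counters job.log   (job.log is a hypothetical file name)
failed_task_counters() {
    # Keep only the counter itself and drop the timestamp/logger prefix.
    sed -n 's/.*\(Failed [a-z]* tasks=[0-9][0-9]*\).*/\1/p' "$1"
}
```

If the function prints anything, the per-task logs for that job ID are probably the next place to look. It only reads the saved text and does not talk to the cluster.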

Exception in thread "main" java.io.FileNotFoundException: File does not exist: /scratch/Similartytemp/prepareRatingMatrix/numUsers.bin
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
	at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
	at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
	at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.run(ItemSimilarityJob.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.main(ItemSimilarityJob.java:93)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I checked the temp directory and here are its contents. I am not sure why the numUsers.bin file is not generated.
-bash-4.1$ hadoop dfs -ls /scratch/Similartytemp/
Warning: $HADOOP_HOME is deprecated.
Found 1 items
drwxr-xr-x   - userid supergroup          0 2013-12-27 12:56 /scratch/Similartytemp/prepareRatingMatrix