Posted to user@mahout.apache.org by "Severance, Steve" <ss...@ebay.com> on 2010/09/04 02:39:43 UTC

Seq2Sparse Exception with nGrams

When I set nGrams to a value greater than 1 (I have tried 2 and 3), I get the following exception.

Here is my command line:
 ./mahout seq2sparse -i <input> -a org.apache.lucene.analysis.WhitespaceAnalyzer -o <output> -x 60 -wt TFIDF -ng 2 -ow

10/09/03 17:35:22 INFO mapred.JobClient: Task Id : attempt_201007221306_12175_m_000013_0, Status : FAILED
java.lang.NullPointerException
        at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:86)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.mahout.utils.nlp.collocations.llr.Gram.write(Gram.java:181)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:179)
        at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1197)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:40)
        at org.apache.mahout.utils.nlp.collocations.llr.CollocCombiner.reduce(CollocCombiner.java:25)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1217)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1227)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1091)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:512)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:585)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


Is this a known issue? nGrams worked with 0.3.

Thanks.

Steve