Posted to user@mahout.apache.org by Christopher Schindler <id...@hotmail.com> on 2013/11/17 20:57:45 UTC
createTermFrequencyVectors, Hadoop, cast error
Hi,
After proving out the FuzzyKMeans clustering methods on the CLI, I'm now moving to a Java app.
I'm running into an issue I can't seem to get past.
Error I'm getting:
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.mahout.common.StringTuple
at org.apache.mahout.vectorizer.collocations.llr.CollocMapper.map(CollocMapper.java:41)
...
I understand the type issue being reported; any insights into the fix? Also, I'm not explicitly calling FSDataOutputStream, as I believe the new Path param passed to the Mahout method handles the output stream.
Here's how I'm calling the method:
<snip>
String luceneSequenceFile = "hdfs://<server>:50070/opt/mahout/lucene-seq/index";
String outputDir = "hdfs://<server>:50070/opt/mahout/fkmeans-newsClusters";
String vectorsOutput = "hdfs://<server>:50070/opt/mahout/fkmeans-newsVectorsOutput";
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
DictionaryVectorizer.createTermFrequencyVectors(
    new Path(luceneSequenceFile),
    new Path(outputDir),
    vectorsOutput,
    conf,
    minSupport,
    maxNGramSize,
    minLLRValue,
    normPower,
    true,
    reduceTasks,
    chunkSize,
    sequentialAccessOutput,
    false);
</snip>
TIA,
Chris