Posted to user@mahout.apache.org by florie <io...@gmail.com> on 2011/05/10 21:23:39 UTC

Problems using ldatopics

Hi, after using lda, I am having problems with reading the output topics
using ldatopics:

The error is as follows:

[ion@lovemachine Downloads]$ mahout-0.4/bin/mahout ldatopics --input sparsePosTokens/tf-vectors --dict sparsePosTokens/dictionary.file-0 --words 30 --output sparsePosTokens/topics --dictionaryType sequencefile
Running on hadoop, using HADOOP_HOME=hadoop
No HADOOP_CONF_DIR set, using hadoop/conf
Exception in thread "main" java.io.IOException: wrong value class: 0.0 is not class org.apache.mahout.math.VectorWritable
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1874)
    at org.apache.mahout.clustering.lda.LDAPrintTopics.topWordsForTopics(LDAPrintTopics.java:208)
    at org.apache.mahout.clustering.lda.LDAPrintTopics.main(LDAPrintTopics.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
[ion@lovemachine Downloads]$ mahout-0.4/bin/mahout ldatopics --input vectorSeqTokens/tf-vectors --dict vectorSeqTokens/dictionary.file-0 --words 30 --output sparsePosTokens/topics --dictionaryType sequencefile
Running on hadoop, using HADOOP_HOME=hadoop
No HADOOP_CONF_DIR set, using hadoop/conf
Exception in thread "main" java.io.IOException: wrong value class: 0.0 is not class org.apache.mahout.math.VectorWritable
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1874)
    at org.apache.mahout.clustering.lda.LDAPrintTopics.topWordsForTopics(LDAPrintTopics.java:208)
    at org.apache.mahout.clustering.lda.LDAPrintTopics.main(LDAPrintTopics.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


I can't seem to figure out what the problem is. Any help is appreciated. Thank
you!
-- 
Ion Florie Ho
iho1@binghamton.edu
Dept. of Systems Science and Industrial Engineering
Binghamton University
4400 Vestal Parkway
Binghamton, NY 13902
Phone: (516) 587-0807

Re: Problems using ldatopics

Posted by Vasil Vasilev <va...@gmail.com>.
Hi Florie,

You should give ldatopics the last state produced by lda as its input,
rather than the tf-vectors.
For a working example, take a look at the script that clusters the Reuters
data set (located at examples/bin/build-reuters.sh in the Mahout source trunk).
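
As a rough sketch of what that means in practice: the snippet below assumes that lda wrote its per-iteration states as directories named state-N under its output directory (the naming used in the Reuters script), and that this output sits on the local filesystem. The helper name latest_state_dir and the path sparsePosTokens/lda are made up for illustration; substitute whatever you passed to lda as its output.

```shell
# Hypothetical helper (not part of Mahout): print the highest-numbered
# state-N directory found under an LDA output directory.
# Assumes the states live on the local filesystem, not on HDFS.
latest_state_dir() {
  lda_out="$1"
  best=""
  best_n=-1
  for d in "$lda_out"/state-*; do
    # Skip anything that is not a directory (including an unmatched glob).
    [ -d "$d" ] || continue
    n="${d##*state-}"
    # Skip names whose suffix is not a plain non-negative integer.
    case "$n" in
      ''|*[!0-9]*) continue ;;
    esac
    if [ "$n" -gt "$best_n" ]; then
      best_n="$n"
      best="$d"
    fi
  done
  if [ -n "$best" ]; then
    printf '%s\n' "$best"
  fi
}

# Usage, reusing the options from the original command; only --input changes.
# sparsePosTokens/lda is an assumed location for the lda output:
#
#   mahout-0.4/bin/mahout ldatopics \
#     --input "$(latest_state_dir sparsePosTokens/lda)" \
#     --dict sparsePosTokens/dictionary.file-0 --words 30 \
#     --output sparsePosTokens/topics --dictionaryType sequencefile
```

The numeric comparison matters because a plain alphabetical listing would put state-10 before state-2. If the lda output is on HDFS instead, the same idea applies but the directory listing would have to come from the hadoop command-line tools.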

Regards, Vasil
