Posted to user@mahout.apache.org by Sebastian Benthall <sb...@gmail.com> on 2012/03/19 02:35:21 UTC
IllegalArgumentExceptions from LDA
Hi all,
I'm trying to use Mahout for LDA but have been getting
IllegalArgumentExceptions like this: https://gist.github.com/2089285
I'm using this script that is based on the cluster-reuters.sh example:
https://gist.github.com/2088888
I've poked around a bit without success, but this warning may be an
indication of what's wrong:
WARN lda.LDADriver: can't determine number of words; no vectors in
mahout-work-hduser/toy-seqdir-sparse-lda/tf-vectors
Just before that, I get this message:
INFO common.HadoopUtil: Deleting
mahout-work-hduser/toy-seqdir-sparse-lda/tf-vectors
which I guess explains why there would be no vectors in that directory.
Is this expected behavior?
The only other thing I can think of comes from the comment on the method
that seems to be generating the warning:
http://mail-archives.apache.org/mod_mbox/mahout-commits/201109.mbox/%3C20110930114400.9DC732388A02@eris.apache.org%3E
Is the problem that I don't have enough documents? I started with a toy
data set and have since added other documents to my document directory to
fill it out, but that hasn't solved the problem.
Thanks in advance,
Seb