Posted to user@mahout.apache.org by Bob Morris <mo...@gmail.com> on 2014/03/30 23:21:31 UTC

text dictionary errors from ClusterDumper

After running CanopyDriver.run on some 4-dimensional DenseVectors, I'm
passing a handcrafted text dictionary to ClusterDumper, with the
dictionary type declared as text. The dictionary looks like this, with
each entry line holding the dimension and feature name separated by a tab:

4
0 recordedBy
1 recordNumber
2 eventDate
3 locality
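
The layout described above can be sketched in plain Java, without any
Mahout classes. The class and method names here are hypothetical, and the
sketch only reproduces the file layout shown in this post (a count line,
then one index<TAB>name line per dimension). It does not establish that
ClusterDumper's text dictionary loader accepts that layout.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout: builds the dictionary text shown above.
public class TextDictionarySketch {

    // First line is the number of entries; each following line is
    // "<dimension index><TAB><feature name>".
    static String buildDictionary(List<String> featureNames) {
        StringBuilder sb = new StringBuilder();
        sb.append(featureNames.size()).append('\n');
        for (int i = 0; i < featureNames.size(); i++) {
            sb.append(i).append('\t').append(featureNames.get(i)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("recordedBy", "recordNumber", "eventDate", "locality");
        // Print the dictionary; redirect to a file to use it as input.
        System.out.print(buildDictionary(names));
    }
}
```

Generating the file this way at least rules out stray spaces in place of
tabs, which are easy to introduce when editing the dictionary by hand.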

ClusterDumper is throwing lots of lines like
14/03/30 16:51:08 ERROR clustering.AbstractClusterWriter: Dictionary entry missing for 3
apparently for each cluster (and, of course, for varying dimensions).

Is there a prohibition on using a text dictionary when the underlying
data are DenseVectors?  If not, what is causing this error?

Running Mahout 0.9 under Eclipse on Ubuntu 13.04.

Thanks
--Bob

-- 
Robert A. Morris

Emeritus Professor  of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390


Filtered Push Project
Harvard University Herbaria
Harvard University

email: morris.bob@gmail.com
web: http://efg.cs.umb.edu/
web: http://wiki.filteredpush.org
http://www.cs.umb.edu/~ram
===
The content of this communication is made entirely on my
own behalf and in no way should be deemed to express
official positions of The University of Massachusetts at Boston or
Harvard University.