Posted to user@mahout.apache.org by Valerio Ceraudo <va...@gmail.com> on 2010/09/05 17:54:08 UTC

how is the Vector format?

Hi, I need to convert from ARFF to the Vector format to run k-means clustering.
I ran some tests and saw that the first command used is:

bin/mahout seqdirectory

that creates a chunk-0 file from a folder of name.sgm files.
I know that Mahout already includes an ARFF --> vector converter, but it
is only experimental, and it doesn't work for me because some dependencies are
missing. I want to write the converter on my own.
Where can I find information on, and a specification of, the .sgm or the Vector format?


Re: how is the Vector format?

Posted by Valerio Ceraudo <va...@gmail.com>.
> Any chance you might have some time to file a JIRA issue for that - or
> maybe even provide a patch that fixes the issue?
> 
> Isabel
> 
> 

I will submit a JIRA issue as soon as I finish my thesis, so not next week but
the week after!
Valerio Ceraudo



Re: how is the Vector format?

Posted by Isabel Drost <is...@apache.org>.
On Sun, 5 Sep 2010 Valerio Ceraudo <va...@gmail.com> wrote:
> ok ok I can run your arffToVector in
> org.apache.utils.vectors.arff.Driver but i found a bug, it doesn't
> recognize the attribute REAL, so I changed the arff attributes in
> NUMERIC and it works,now I have got a iris.arff.MVC file.

Any chance you might have some time to file a JIRA issue for that - or
maybe even provide a patch that fixes the issue?

Isabel


Re: how is the Vector format?

Posted by Valerio Ceraudo <va...@gmail.com>.
Valerio Ceraudo <valerio.ceraudo <at> gmail.com> writes:

> 
> hi, I need to convert from Arff to Vector format to do a kmeans clustering.
> I did some test and I saw that the first command uses:
> 
> bin/mahout seqdirectory
> 
> that creates a chunk-0 file from a folder with name.sgm files.
> I know that inside mahout there is already an arff --> vector converter,but it
> is just an experiment and it doesn't work because it misses some dependencies.
> I want to make the converter from my own.
> Where i can found information and specify about the .sgm or the Vector format?
> 
> 

OK, I can run your arffToVector in org.apache.utils.vectors.arff.Driver, but I
found a bug: it doesn't recognize the attribute type REAL. I changed the ARFF
attributes to NUMERIC and it works; now I have an iris.arff.MVC file.
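
For anyone hitting the same problem, here is a rough sketch of the workaround described above: rewriting REAL attribute declarations to NUMERIC in the ARFF header before running the converter. It is not part of Mahout; the function name real_to_numeric and the sample data are made up for illustration, and it assumes each @attribute declaration sits on a single line with the type as its last token.

```python
import re

# Matches an @attribute line whose declared type is REAL (case-insensitive).
# Group 1 is everything up to the type; group 2 is trailing whitespace.
_REAL_ATTR = re.compile(r"^(\s*@attribute\s+.+?\s+)real(\s*)$", re.IGNORECASE)


def real_to_numeric(arff_text: str) -> str:
    """Return arff_text with REAL attribute types replaced by NUMERIC.

    Hypothetical helper: only header lines (before @data) are rewritten,
    so data rows are never touched.
    """
    out = []
    in_data = False
    for line in arff_text.splitlines():
        if line.strip().lower().startswith("@data"):
            in_data = True
        if not in_data:
            line = _REAL_ATTR.sub(r"\1NUMERIC\2", line)
        out.append(line)
    return "\n".join(out)


# Made-up sample in the style of iris.arff.
sample = "\n".join([
    "@RELATION iris",
    "@ATTRIBUTE sepallength REAL",
    "@ATTRIBUTE class {Iris-setosa,Iris-versicolor}",
    "@DATA",
    "5.1,Iris-setosa",
])

converted = real_to_numeric(sample)
```

ARFF treats REAL and NUMERIC as the same numeric type, so the rewrite should not change what the data means; it only sidesteps the parser issue until it is fixed in the converter.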