Posted to user@mahout.apache.org by JAGANADH G <ja...@gmail.com> on 2012/10/11 08:44:21 UTC
Create vector from text
Hi All
As of Mahout 0.7, a classifier takes a vector for classification.
Can anybody guide me on how to create a vector from text? I am not looking to
create a vector from a file stored in HDFS or the local file system.
At runtime my system will be receiving text input to classify.
Best regards
--
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
Re: Create vector from text
Posted by JAGANADH G <ja...@gmail.com>.
On Thu, Oct 11, 2012 at 12:29 PM, Ted Dunning <te...@gmail.com> wrote:
> You have to tokenize your text and then use some form of vector encoding.
>
> If you have a known dictionary of all interesting words, you can simply
> make a vector as long as the number of words in your dictionary and put a 1
> in the right place.
>
> If you don't want to do that either because you don't know all the words in
> advance or because the number of words is too large, you can use
> a TextValueEncoder to do the deed. There is sample code in the Mahout in
> Action code for this and Chapter 14 in Mahout in Action talks about the
> code. You can get the code from http://github.com/tdunning/MiA
>
>
Hi Ted
Thanks for the pointer.
It works.
Sorry to shoot another question.
Is there any way to get the label for a classifier result in the 0.7 API?
Best regards
--
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in
Re: Create vector from text
Posted by Ted Dunning <te...@gmail.com>.
You have to tokenize your text and then use some form of vector encoding.
If you have a known dictionary of all interesting words, you can simply
make a vector as long as the number of words in your dictionary and put a 1
in the right place.
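A minimal sketch of the dictionary approach, in plain Java with no Mahout dependency. The class name, the tokenize helper, and the four-word dictionary are hypothetical and only for illustration. In a Mahout program the resulting array would presumably be wrapped in a Vector implementation such as DenseVector before being passed to the classifier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of Mahout.
class DictionaryEncoding {

    // Assumed tokenizer: lower-case, then split on runs of non-letter/non-digit characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Assign each distinct dictionary word a fixed position in the vector.
    static Map<String, Integer> buildDictionary(List<String> words) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String word : words) {
            dictionary.putIfAbsent(word.toLowerCase(Locale.ROOT), dictionary.size());
        }
        return dictionary;
    }

    // Vector as long as the dictionary, with a 1 wherever a known word occurs in the text.
    static double[] encode(String text, Map<String, Integer> dictionary) {
        double[] vector = new double[dictionary.size()];
        for (String token : tokenize(text)) {
            Integer index = dictionary.get(token);
            if (index != null) {
                vector[index] = 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        Map<String, Integer> dictionary =
                buildDictionary(Arrays.asList("mahout", "vector", "text", "classifier"));
        System.out.println(Arrays.toString(encode("Create a vector from text", dictionary)));
    }
}
```

Words that are not in the dictionary are silently dropped, which is why this only works when the interesting vocabulary is known in advance.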
If you don't want to do that either because you don't know all the words in
advance or because the number of words is too large, you can use
a TextValueEncoder to do the deed. There is sample code in the Mahout in
Action code for this and Chapter 14 in Mahout in Action talks about the
code. You can get the code from http://github.com/tdunning/MiA
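The idea behind a hashed encoder can be sketched in plain Java as follows. This is not Mahout's TextValueEncoder and makes no claim about its hash function or weighting; it only illustrates the general technique of hashing each token into a fixed-size vector, so no dictionary is needed. The class name, the tokenizing regex, and the cardinality of 16 are assumptions chosen for the example.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration class; not part of Mahout.
class HashedEncoding {

    // Hash every token into one of `cardinality` slots and count occurrences there.
    static double[] encode(String text, int cardinality) {
        double[] vector = new double[cardinality];
        // Assumed tokenizer: lower-case, then split on runs of non-letter/non-digit characters.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // floorMod keeps the index non-negative even when hashCode() is negative.
            int index = Math.floorMod(token.hashCode(), cardinality);
            vector[index] += 1.0;
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("Create a vector from text", 16)));
    }
}
```

The vector length is fixed up front, so previously unseen words still land somewhere. The trade-off is that two different words can collide in the same slot, which becomes more likely as the cardinality shrinks.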
On Wed, Oct 10, 2012 at 11:44 PM, JAGANADH G <ja...@gmail.com> wrote:
> Hi All
>
> As of Mahout 0.7, a classifier takes a vector for classification.
> Can anybody guide me on how to create a vector from text? I am not looking to
> create a vector from a file stored in HDFS or the local file system.
> At runtime my system will be receiving text input to perform
> classification.
>
> Best regards
>
> --
> **********************************
> JAGANADH G
> http://jaganadhg.in
> *ILUGCBE*
> http://ilugcbe.org.in
>