Posted to user@mahout.apache.org by vybe3142 <vy...@gmail.com> on 2013/01/29 23:26:49 UTC

Classifying a single file after training the model

1. Index the training data that I've pre-classified manually, then perform
training and testing. Everything works fine up to this point.



This seems to work (judging by the confusion matrix), even though these are
plain old text snippets as opposed to newsgroup text articles.

2. At this point, I want to classify individual files that are not part of
the training set. I've tried a number of things, none of which work.
For example, I invoke main() on TestNewsGroups.java
<http://svn.apache.org/repos/asf/mahout/trunk/examples/src/main/java/org/apache/mahout/classifier/sgd/TestNewsGroups.java>
with the args



and end up with an exception.


Any idea what I can do to fix this? Thanks



--
View this message in context: http://lucene.472066.n3.nabble.com/Classifying-a-single-file-after-training-the-model-tp4037203.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Classifying a single file after training the model

Posted by vy...@gmail.com.
Anyone? .. Help



On Jan 29, 2013, at 5:48 PM, vybe3142@gmail.com wrote:

> Sorry, the raw exception from my Nabble post didn't come through in the email.
> 
> Exception in thread "main" java.io.UTFDataFormatException: malformed input around byte 5
>     at java.io.DataInputStream.readUTF(DataInputStream.java:617)
>     at java.io.DataInputStream.readUTF(DataInputStream.java:547)
>     at org.apache.mahout.classifier.sgd.PolymorphicWritable.read(PolymorphicWritable.java:41)
>     at org.apache.mahout.classifier.sgd.ModelSerializer.readBinary(ModelSerializer.java:69)
>     at com.memonews.mahout.sentiment.TestNewsGroups.run(TestNewsGroups.java:67)
>     at com.memonews.mahout.sentiment.TestNewsGroups.main(TestNewsGroups.java:59)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120

Re: Classifying a single file after training the model

Posted by vy...@gmail.com.
Sorry, the raw exception from my Nabble post didn't come through in the email.

Exception in thread "main" java.io.UTFDataFormatException: malformed input around byte 5
    at java.io.DataInputStream.readUTF(DataInputStream.java:617)
    at java.io.DataInputStream.readUTF(DataInputStream.java:547)
    at org.apache.mahout.classifier.sgd.PolymorphicWritable.read(PolymorphicWritable.java:41)
    at org.apache.mahout.classifier.sgd.ModelSerializer.readBinary(ModelSerializer.java:69)
    at com.memonews.mahout.sentiment.TestNewsGroups.run(TestNewsGroups.java:67)
    at com.memonews.mahout.sentiment.TestNewsGroups.main(TestNewsGroups.java:59)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120
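
One way to narrow this down: the trace shows readUTF failing inside PolymorphicWritable.read, which suggests the reader expects the file to begin with a string in DataOutput.writeUTF format (presumably a class name). If the path handed to ModelSerializer.readBinary is not a file written by the matching serializer, for instance a text file or a directory listing, this is the kind of failure I would expect. That is a guess about the cause, not a confirmed diagnosis. The sketch below uses only the JDK and checks whether a file starts with such a string; the class name ModelHeaderCheck and the method leadingUtf are made up for illustration and are not part of Mahout.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Mahout: peeks at the start of a file to see
// whether it begins with a string in DataOutput.writeUTF format.
public class ModelHeaderCheck {

    // Returns the leading writeUTF-encoded string, or null if the stream
    // does not begin with one.
    public static String leadingUtf(InputStream raw) {
        try (DataInputStream in = new DataInputStream(raw)) {
            return in.readUTF();
        } catch (IOException e) {
            // Covers UTFDataFormatException (invalid bytes) and
            // EOFException (length prefix larger than the data).
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ModelHeaderCheck <path-to-model-file>");
            return;
        }
        String header = leadingUtf(new FileInputStream(args[0]));
        System.out.println(header == null
            ? "No writeUTF header found; this is probably not a serialized model."
            : "Leading string: " + header);
    }
}
```

A non-null result only means the first bytes happen to decode; the file is a plausible model only if the printed string looks like a fully qualified class name. If the check fails on the path passed to TestNewsGroups, the arguments are the first thing to re-examine.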

On Jan 29, 2013, at 5:14 PM, Suneel Marthi <su...@yahoo.com> wrote:

> What's the exception you are seeing?

Re: Classifying a single file after training the model

Posted by Suneel Marthi <su...@yahoo.com>.
What's the exception you are seeing?



