Posted to user@mahout.apache.org by Sam Cunningham <sa...@yahoo.com> on 2011/10/31 21:22:49 UTC

NaN - classification results (cbayes)

I am using the ClassifierDemo class at the link below:

https://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/src/org/bc/kl/ClassifierDemo.java 

I tested the 20news test data with the trained model, and it worked fine.
However, when I ran the same class (ClassifierDemo) against my own test
dataset with my own trained model, I got the messages below. It returns NaN
for every candidate class, and the label it finally assigns to the test
dataset is always the same: "Health", which I think is the default category.
Why is it doing this? I am fairly sure I trained it and generated the model
correctly.

Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 50000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 100000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Read 150000 feature weights
Oct 31, 2011 2:00:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: 0.0
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Health NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: SciTech NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: General NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Entertainment NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Politics NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Sports NaN NaN NaN
Oct 31, 2011 2:00:11 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Business NaN NaN NaN
Label: Health Score: 988.1363714476455

--
View this message in context: http://lucene.472066.n3.nabble.com/NaN-classification-results-cbayes-tp3468910p3468910.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: NaN - classification results (cbayes)

Posted by Ted Dunning <te...@gmail.com>.
Sam,

I recommend subscribing to the mailing list while you have active
questions.  There is a long history of Nabble postings not making it to
the Apache mailing lists.

On Wed, Nov 2, 2011 at 12:19 PM, Sam Cunningham <sa...@yahoo.com> wrote:

> Below I am providing some documents regarding the issue. The first four
> are samples of the normalized classes (Entertainment, Health, SciTech,
> and Sports). The last document is the model.
>
> http://12.233.16.76/icons/Entertainment.zip
> http://12.233.16.76/icons/Health.zip
> http://12.233.16.76/icons/SciTech.zip
> http://12.233.16.76/icons/Sports.zip
> http://12.233.16.76/icons/articles-model.zip

Re: NaN - classification results (cbayes)

Posted by Ted Dunning <te...@gmail.com>.
I can't download these files.  The server never responds as far as I can
tell.  You may have given out a local address, or turned the machine off,
or something else.

Can you put them onto dropbox or pastebin or S3 or something so that we can
look at these?

On Wed, Nov 2, 2011 at 12:19 PM, Sam Cunningham <sa...@yahoo.com> wrote:

> Below I am providing some documents regarding the issue. The first four
> are samples of the normalized classes (Entertainment, Health, SciTech,
> and Sports). The last document is the model.
>
> http://12.233.16.76/icons/Entertainment.zip
> http://12.233.16.76/icons/Health.zip
> http://12.233.16.76/icons/SciTech.zip
> http://12.233.16.76/icons/Sports.zip
> http://12.233.16.76/icons/articles-model.zip

Re: NaN - classification results (cbayes)

Posted by Sam Cunningham <sa...@yahoo.com>.
Here are the files:

http://lucene.472066.n3.nabble.com/file/n3480755/Entertainment.zip
http://lucene.472066.n3.nabble.com/file/n3480755/Health.zip
http://lucene.472066.n3.nabble.com/file/n3480755/SciTech.zip
http://lucene.472066.n3.nabble.com/file/n3480755/Sports.zip
http://lucene.472066.n3.nabble.com/file/n3480755/articles-model.zip

--
View this message in context: http://lucene.472066.n3.nabble.com/NaN-classification-results-cbayes-tp3468910p3480755.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: NaN - classification results (cbayes)

Posted by Ted Dunning <te...@gmail.com>.
Ahh...

Each document should be on a separate line.  You appear to have
concatenated all your documents onto a single line.
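
To illustrate the one-document-per-line layout described above, here is a
minimal Java sketch. It is not part of Mahout or of ClassifierDemo: the class
name OneDocPerLine, its methods, and the sample documents are all hypothetical.
It assumes each raw document is available as a string, collapses the line
breaks inside each one, and writes every document on its own line. Any label
or tokenization the trainer also expects is not handled here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout or ClassifierDemo.
class OneDocPerLine {

    // Collapse every run of whitespace, including line breaks, into a single
    // space so that the whole document fits on one line.
    static String toSingleLine(String document) {
        return document.trim().replaceAll("\\s+", " ");
    }

    // Turn a list of raw documents into output lines, one per document.
    // Documents that are empty after normalization are skipped.
    static List<String> toLines(List<String> documents) {
        List<String> lines = new ArrayList<>();
        for (String document : documents) {
            String line = toSingleLine(document);
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample documents, each spanning several lines.
        List<String> documents = Arrays.asList(
                "flu season\nstarts early this year",
                "team wins\r\nthe championship final");

        Path out = Files.createTempFile("Health", ".txt");
        // Files.write terminates each element with a line separator.
        Files.write(out, toLines(documents), StandardCharsets.UTF_8);

        // Sanity check: the line count should equal the document count.
        int lineCount = Files.readAllLines(out, StandardCharsets.UTF_8).size();
        System.out.println(lineCount + " documents written to " + out);
    }
}
```

Counting the lines of each class file and comparing that number with the
number of documents you expect is a quick way to spot a file in which
everything ended up on a single line.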

On Wed, Nov 2, 2011 at 7:23 PM, Sam Cunningham <sa...@yahoo.com> wrote:

> It seems that some of us were not able to get to the URLs. So, I am
> uploading the files here.
>
> http://lucene.472066.n3.nabble.com/file/n3475998/Entertainment.zip
> http://lucene.472066.n3.nabble.com/file/n3475998/Health.zip
> http://lucene.472066.n3.nabble.com/file/n3475998/SciTech.zip
> http://lucene.472066.n3.nabble.com/file/n3475998/Sports.zip
> http://lucene.472066.n3.nabble.com/file/n3475998/articles-model.zip

Re: NaN - classification results (cbayes)

Posted by Sam Cunningham <sa...@yahoo.com>.
It seems that some of us were not able to get to the URLs. So, I am uploading
the files here.

http://lucene.472066.n3.nabble.com/file/n3475998/Entertainment.zip
http://lucene.472066.n3.nabble.com/file/n3475998/Health.zip
http://lucene.472066.n3.nabble.com/file/n3475998/SciTech.zip
http://lucene.472066.n3.nabble.com/file/n3475998/Sports.zip
http://lucene.472066.n3.nabble.com/file/n3475998/articles-model.zip

--
View this message in context: http://lucene.472066.n3.nabble.com/NaN-classification-results-cbayes-tp3468910p3475998.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: NaN - classification results (cbayes)

Posted by Sam Cunningham <sa...@yahoo.com>.
Below I am providing some documents regarding the issue. The first four
are samples of the normalized classes (Entertainment, Health, SciTech, and
Sports). The last document is the model.

http://12.233.16.76/icons/Entertainment.zip 
http://12.233.16.76/icons/Health.zip 
http://12.233.16.76/icons/SciTech.zip 
http://12.233.16.76/icons/Sports.zip 
http://12.233.16.76/icons/articles-model.zip 


--
View this message in context: http://lucene.472066.n3.nabble.com/NaN-classification-results-cbayes-tp3468910p3474879.html
Sent from the Mahout User List mailing list archive at Nabble.com.