Posted to dev@mahout.apache.org by "Grant Ingersoll (Commented) (JIRA)" <ji...@apache.org> on 2011/11/01 17:59:32 UTC
[jira] [Commented] (MAHOUT-857) Rework 20 NewsGroup shell script example to include SGD Example
[ https://issues.apache.org/jira/browse/MAHOUT-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141321#comment-13141321 ]
Grant Ingersoll commented on MAHOUT-857:
----------------------------------------
Here's the confusion matrix I'm getting, which clearly points to some idiocy on my part:
{quote}
7532 test files
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances : 374 4.9655%
Incorrectly Classified Instances : 7158 95.0345%
Total Classified Instances : 7532
=======================================================
Confusion Matrix
-------------------------------------------------------
a b c d e f g h i j k l m n o p q r s t u <--Classified as
123 0 1 1 1 2 6 19 2 2 5 23 27 8 53 3 14 17 12 0 0 | 319 a = alt.atheism
55 16 28 14 80 24 3 8 4 3 8 86 27 28 0 2 3 0 0 0 0 | 389 b = comp.graphics
38 171 57 14 49 5 3 6 2 4 3 25 7 6 1 1 0 2 0 0 0 | 394 c = comp.os.ms-windows.misc
10 14 237 18 17 15 2 7 4 0 2 54 7 4 0 0 0 1 0 0 0 | 392 d = comp.sys.ibm.pc.hardware
20 10 55 159 17 20 7 11 5 0 1 63 13 2 0 1 0 1 0 0 0 | 385 e = comp.sys.mac.hardware
11 25 5 0 306 13 3 1 0 5 2 13 5 6 0 0 0 0 0 0 0 | 395 f = comp.windows.x
2 1 23 14 6 310 1 3 3 1 1 10 6 5 0 3 0 1 0 0 0 | 390 g = misc.forsale
8 1 6 2 9 11 270 15 10 3 3 37 11 4 0 2 0 4 0 0 0 | 396 h = rec.autos
7 0 1 1 8 6 14 326 1 0 1 12 17 3 1 0 0 0 0 0 0 | 398 i = rec.motorcycles
17 1 2 1 2 5 2 7 295 26 1 16 12 2 0 2 3 3 0 0 0 | 397 j = rec.sport.baseball
6 1 0 0 1 3 3 6 55 291 1 7 4 14 2 4 1 0 0 0 0 | 399 k = rec.sport.hockey
22 2 0 3 5 3 0 3 2 1 293 24 12 7 0 4 2 13 0 0 0 | 396 l = sci.crypt
25 6 23 13 15 11 10 18 4 3 13 212 18 16 2 1 1 2 0 0 0 | 393 m = sci.electronics
14 4 5 2 5 7 2 17 7 3 0 38 268 11 4 3 4 2 0 0 0 | 396 n = sci.med
22 1 0 1 3 4 0 8 1 4 2 34 26 279 0 2 2 5 0 0 0 | 394 o = sci.space
43 1 2 4 0 4 1 11 4 1 0 9 33 8 249 2 5 14 7 0 0 | 398 p = soc.religion.christian
21 0 0 1 3 3 2 12 6 2 3 10 16 5 1 235 4 40 0 0 0 | 364 q = talk.politics.guns
41 0 0 2 1 1 5 3 3 7 0 10 12 5 1 8 250 27 0 0 0 | 376 r = talk.politics.mideast
34 0 0 1 2 4 3 16 2 1 5 14 12 6 4 67 8 131 0 0 0 | 310 s = talk.politics.misc
50 0 0 1 2 0 1 15 7 0 3 11 21 7 53 17 6 19 38 0 0 | 251 t = talk.religion.misc
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 0 u = DEFAULT
Default Category: DEFAULT: 20
{quote}
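For anyone checking the numbers in the quoted output, here is a minimal Python sketch that recomputes the summary from the matrix rows. It is not part of the Mahout patch. The helper names (parse_matrix, diagonal_sum) are hypothetical, and the rows are copied by hand from the quote above. It sums the main diagonal (actual class equals predicted class) and, for comparison, the cells one column to the left of the diagonal.

```python
# Counts copied from the quoted confusion matrix: one row per actual class (a..u),
# one column per predicted class (a..u). Row totals and labels are left out.
MATRIX_TEXT = """
123 0 1 1 1 2 6 19 2 2 5 23 27 8 53 3 14 17 12 0 0
55 16 28 14 80 24 3 8 4 3 8 86 27 28 0 2 3 0 0 0 0
38 171 57 14 49 5 3 6 2 4 3 25 7 6 1 1 0 2 0 0 0
10 14 237 18 17 15 2 7 4 0 2 54 7 4 0 0 0 1 0 0 0
20 10 55 159 17 20 7 11 5 0 1 63 13 2 0 1 0 1 0 0 0
11 25 5 0 306 13 3 1 0 5 2 13 5 6 0 0 0 0 0 0 0
2 1 23 14 6 310 1 3 3 1 1 10 6 5 0 3 0 1 0 0 0
8 1 6 2 9 11 270 15 10 3 3 37 11 4 0 2 0 4 0 0 0
7 0 1 1 8 6 14 326 1 0 1 12 17 3 1 0 0 0 0 0 0
17 1 2 1 2 5 2 7 295 26 1 16 12 2 0 2 3 3 0 0 0
6 1 0 0 1 3 3 6 55 291 1 7 4 14 2 4 1 0 0 0 0
22 2 0 3 5 3 0 3 2 1 293 24 12 7 0 4 2 13 0 0 0
25 6 23 13 15 11 10 18 4 3 13 212 18 16 2 1 1 2 0 0 0
14 4 5 2 5 7 2 17 7 3 0 38 268 11 4 3 4 2 0 0 0
22 1 0 1 3 4 0 8 1 4 2 34 26 279 0 2 2 5 0 0 0
43 1 2 4 0 4 1 11 4 1 0 9 33 8 249 2 5 14 7 0 0
21 0 0 1 3 3 2 12 6 2 3 10 16 5 1 235 4 40 0 0 0
41 0 0 2 1 1 5 3 3 7 0 10 12 5 1 8 250 27 0 0 0
34 0 0 1 2 4 3 16 2 1 5 14 12 6 4 67 8 131 0 0 0
50 0 0 1 2 0 1 15 7 0 3 11 21 7 53 17 6 19 38 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
"""


def parse_matrix(text):
    """Parse whitespace-separated integer rows into a list of lists."""
    return [[int(tok) for tok in line.split()] for line in text.strip().splitlines()]


def diagonal_sum(matrix, offset=0):
    """Sum cells where predicted column == actual row + offset.

    offset=0 is the main diagonal (correct predictions); cells that would
    fall outside the matrix are skipped.
    """
    n = len(matrix)
    return sum(matrix[r][r + offset] for r in range(n) if 0 <= r + offset < n)


matrix = parse_matrix(MATRIX_TEXT)
total = sum(sum(row) for row in matrix)
correct = diagonal_sum(matrix)             # actual == predicted
shifted = diagonal_sum(matrix, offset=-1)  # predicted one column left of actual

print(f"total={total} correct={correct} ({100 * correct / total:.4f}%)")
print(f"one column left of diagonal={shifted} ({100 * shifted / total:.2f}%)")
```

The main diagonal sums to the 374 correct instances reported in the summary. The cells one column to the left of the diagonal sum to 4375 of the 7532 instances, which may be a useful place to start looking, though the sketch only restates the arithmetic and does not identify the cause.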
> Rework 20 NewsGroup shell script example to include SGD Example
> ---------------------------------------------------------------
>
> Key: MAHOUT-857
> URL: https://issues.apache.org/jira/browse/MAHOUT-857
> Project: Mahout
> Issue Type: Improvement
> Reporter: Grant Ingersoll
> Attachments: MAHOUT-857.patch
>
>
> We have build-20news-bayes.sh, which runs our Naive Bayes (NB) example on 20 news groups. We also have an SGD example that works on 20 news groups, but no script to run it. I'm going to rename build-20news-bayes.sh to classify-20news.sh and have it cover both.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira