Posted to dev@nutch.apache.org by "Asitang Mishra (JIRA)" <ji...@apache.org> on 2015/10/12 05:25:05 UTC

[jira] [Created] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter

Asitang Mishra created NUTCH-2136:
-------------------------------------

             Summary: Implement a different version of Naive Bayes Parse Filter
                 Key: NUTCH-2136
                 URL: https://issues.apache.org/jira/browse/NUTCH-2136
             Project: Nutch
          Issue Type: Improvement
          Components: parser
            Reporter: Asitang Mishra
             Fix For: 1.10


There have been many dependency issues with the first implementation of the Naive Bayes Parse Filter. The major dependencies were Mahout and Lucene. In addition, the training process failed in distributed mode because a nested Hadoop job was unable to run on the cluster.
To remove these issues and allow the filter to run in a distributed environment, I am going to implement my own version of Naive Bayes that does not depend on any machine learning library.
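
For illustration only, here is a minimal sketch of what a Naive Bayes text classifier with no external dependencies could look like. It is not the implementation proposed in this issue. The class name SimpleNaiveBayes, its methods, the whitespace/punctuation tokenizer, the add-one (Laplace) smoothing, and the "relevant"/"irrelevant" labels are all assumptions made for the example. Training happens in memory with plain java.util collections, so no nested Hadoop job is needed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: multinomial Naive Bayes with add-one smoothing, JDK only. */
class SimpleNaiveBayes {
    // label -> (word -> number of occurrences in training text for that label)
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // label -> total number of word occurrences seen for that label
    private final Map<String, Integer> totalWords = new HashMap<>();
    // label -> number of training documents with that label
    private final Map<String, Integer> docCounts = new HashMap<>();
    // every distinct word seen during training, across all labels
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Assumed tokenizer: lowercase, split on runs of non-word characters.
    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }

    /** Adds one labeled training document to the in-memory counts. */
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            if (word.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    /** log P(label) + sum of log P(word | label), using add-one smoothing. */
    double logScore(String label, String text) {
        double score = Math.log(docCounts.get(label) / (double) totalDocs);
        Map<String, Integer> counts = wordCounts.get(label);
        int total = totalWords.getOrDefault(label, 0);
        for (String word : tokenize(text)) {
            if (word.isEmpty()) {
                continue;
            }
            int count = counts.getOrDefault(word, 0);
            score += Math.log((count + 1.0) / (total + vocabulary.size()));
        }
        return score;
    }

    /** Returns the trained label with the highest log score, or null if untrained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            double score = logScore(label, text);
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

A caller would invoke train(label, text) once per labeled example and then classify(text) on the parsed text of a new page. Working in log space avoids numeric underflow when many small word probabilities are multiplied together.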



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)