Posted to issues@opennlp.apache.org by "Erik Andersson (Updated) (JIRA)" <ji...@apache.org> on 2012/03/31 21:54:25 UTC

[jira] [Updated] (OPENNLP-488) Doccat training tool throws NullPointer error

     [ https://issues.apache.org/jira/browse/OPENNLP-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Andersson updated OPENNLP-488:
-----------------------------------

    Attachment: en-doccat.train
    
> Doccat training tool throws NullPointer error
> ---------------------------------------------
>
>                 Key: OPENNLP-488
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-488
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Doccat
>         Environment: Using Cygwin on Windows
>                      java version "1.6.0_27"
>                      Java(TM) SE Runtime Environment (build 1.6.0_27-b07)
>                      Java HotSpot(TM) Client VM (build 20.2-b06, mixed mode)
>                      apache-opennlp-1.5.2
>            Reporter: Erik Andersson
>         Attachments: en-doccat.train
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When I follow the Doccat training example in the OpenNLP 1.5.2 documentation, the trainer throws a NullPointerException.
> http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.doccat.training.tool
> $ bin/opennlp DoccatTrainer -encoding UTF-8 -lang en -data en-doccat.train -model en-doccat.bin
> Indexing events using cutoff of 5
>         Computing event counts...  done. 2 events
>         Indexing...  Dropped event GMDecrease:[bow=Major, bow=acquisitions, bow=that, bow=have, bow=a, bow=lower, bow=gross, bow=margin, bow=than, bow=the, bow=existing, bow=network, bow=also, bow=had, bow=a, bow=negative, bow=impact, bow=on, bow=the, bow=overall, bow=gross, bow=margin,, bow=but, bow=it, bow=should, bow=improve, bow=following, bow=the, bow=implementation, bow=of, bow=its, bow=integration, bow=strategies, bow=.]
> Dropped event GMIncrease:[bow=The, bow=upward, bow=movement, bow=of, bow=gross, bow=margin, bow=resulted, bow=from, bow=amounts, bow=pursuant, bow=to, bow=adjustments, bow=to, bow=obligations, bow=towards, bow=dealers, bow=.]
> done.
> Sorting and merging events... Done indexing.
> Incorporating indexed data for training...
> Exception in thread "main" java.lang.NullPointerException
>         at opennlp.maxent.GISTrainer.trainModel(GISTrainer.java:263)
>         at opennlp.maxent.GIS.trainModel(GIS.java:256)
>         at opennlp.model.TrainUtil.train(TrainUtil.java:182)
>         at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:154)
>         at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:176)
>         at opennlp.tools.doccat.DocumentCategorizerME.train(DocumentCategorizerME.java:192)
>         at opennlp.tools.cmdline.doccat.DoccatTrainerTool.run(DoccatTrainerTool.java:91)
>         at opennlp.tools.cmdline.CLI.main(CLI.java:191)
> The file "en-doccat.train" is UTF-8 encoded with UNIX line endings and looks like this:
> GMDecrease  Major acquisitions that have a lower gross margin than the existing network also had a negative impact on the overall gross margin, but it should improve following the implementation of its integration strategies .
> GMIncrease  The upward movement of gross margin resulted from amounts pursuant to adjustments to obligations towards dealers .
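
The log above reports a cutoff of 5, 2 events, and both events dropped during indexing. The following is a minimal sketch of one plausible reading of that output, not OpenNLP's actual code: it assumes the cutoff is a minimum occurrence count per bag-of-words feature, and that an event whose features all fall below the cutoff is dropped. The class and method names (CutoffSketch, countFeatures, survivingEvents) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT the OpenNLP implementation.
class CutoffSketch {

    // Counts how often each whitespace-separated token occurs across all samples.
    // The first field of every line is treated as the category label and skipped.
    static Map<String, Integer> countFeatures(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            for (int i = 1; i < parts.length; i++) {
                counts.merge("bow=" + parts[i], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns how many samples keep at least one feature whose total count
    // reaches the cutoff (ASSUMPTION: this mirrors why events get dropped).
    static int survivingEvents(List<String> lines, int cutoff) {
        Map<String, Integer> counts = countFeatures(lines);
        int surviving = 0;
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            for (int i = 1; i < parts.length; i++) {
                if (counts.get("bow=" + parts[i]) >= cutoff) {
                    surviving++;
                    break;
                }
            }
        }
        return surviving;
    }

    public static void main(String[] args) {
        // The two training lines quoted in the issue.
        List<String> lines = Arrays.asList(
            "GMDecrease  Major acquisitions that have a lower gross margin than the existing network also had a negative impact on the overall gross margin, but it should improve following the implementation of its integration strategies .",
            "GMIncrease  The upward movement of gross margin resulted from amounts pursuant to adjustments to obligations towards dealers .");

        System.out.println(survivingEvents(lines, 5)); // prints 0
        System.out.println(survivingEvents(lines, 1)); // prints 2
    }
}
```

Under these assumptions, no token in the two-line file occurs five times, so nothing survives a cutoff of 5 and the trainer would be handed an empty event set. If that reading is right, a larger training file or a lower cutoff should avoid the situation, and the tool could report the empty event set with a clearer message than a NullPointerException.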
