Posted to issues@opennlp.apache.org by "Martin Wiesner (Jira)" <ji...@apache.org> on 2023/09/01 14:46:00 UTC
[jira] [Comment Edited] (OPENNLP-1190) CONLL02 format
[ https://issues.apache.org/jira/browse/OPENNLP-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17761328#comment-17761328 ]
Martin Wiesner edited comment on OPENNLP-1190 at 9/1/23 2:45 PM:
-----------------------------------------------------------------
As of 2023, [https://www.lsi.upc.es/~nlp/tools/nerc/nerc.html] returns a 404, so the resource mentioned on the mailing list in 2014 is no longer available at that address.
Alternatively, the URL
[https://www.lsi.upc.edu/~nlp/tools/nerc/nerc.html]
works.
was (Author: mawiesne):
As of 2023, [https://www.lsi.upc.es/~nlp/tools/nerc/nerc.html] returns a 404, so the resource mentioned on the mailing list in 2014 is no longer available at that address.
> CONLL02 format
> --------------
>
> Key: OPENNLP-1190
> URL: https://issues.apache.org/jira/browse/OPENNLP-1190
> Project: OpenNLP
> Issue Type: Bug
> Components: Formats
> Affects Versions: tools-1.5.3
> Reporter: Luca
> Priority: Major
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> According to the documentation, the following should work:
> bin/opennlp TokenNameFinderConverter conll02 -data esp.train -lang es -types per > es_corpus_train_persons.txt
> However, it currently fails with an error message because it expects 3 columns instead of the 2 that are in the dataset.
> This is a bug introduced at line 130 of opennlp.tools.formats.Conll02NameSampleStream.java, where a length of 3 is imposed.
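
To illustrate the column-count problem described above, here is a minimal Java sketch of a line parser that accepts either 2 or 3 whitespace-separated columns. It is not the code in Conll02NameSampleStream; the class name Conll02LineSketch and the method parseLine are hypothetical, and the sketch assumes that the token is the first column and the named-entity tag is the last one.

```java
public class Conll02LineSketch {

    /**
     * Parses one data line and returns {token, tag}.
     * Assumption (not taken from OpenNLP): the token is the first column,
     * the tag is the last column, and an optional middle column is ignored.
     */
    public static String[] parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        // Accept both layouts instead of insisting on exactly 3 columns.
        if (fields.length != 2 && fields.length != 3) {
            throw new IllegalArgumentException(
                "Expected 2 or 3 columns but got " + fields.length + ": " + line);
        }
        return new String[] {fields[0], fields[fields.length - 1]};
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, one per layout.
        String[] twoColumns = parseLine("Melbourne B-LOC");
        String[] threeColumns = parseLine("Melbourne NP B-LOC");
        System.out.println(twoColumns[0] + " " + twoColumns[1]);
        System.out.println(threeColumns[0] + " " + threeColumns[1]);
    }
}
```

Whether the real converter should accept both layouts or only report a clearer error is a design decision for the project; the sketch only shows that a strict length check of 3 is what rejects 2-column input.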
--
This message was sent by Atlassian Jira
(v8.20.10#820010)