Posted to issues@opennlp.apache.org by "Tommaso Teofili (JIRA)" <ji...@apache.org> on 2017/12/16 09:52:00 UTC
[jira] [Updated] (OPENNLP-1169) WordVectorTable should reference WVs by String
[ https://issues.apache.org/jira/browse/OPENNLP-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tommaso Teofili updated OPENNLP-1169:
-------------------------------------
Description:
The {{WordVectorsTable}} API retrieves a {{WordVector}} via a {{CharSequence}}. This is suboptimal, because implementors may store the WVs in a hash table (e.g. {{MapWordVectorsTable}}) and the value of {{CharSequence#toString}} is not guaranteed to be stable.
Additionally, words are more commonly represented as Strings than as CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).
So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.
was:
The {{WordVectorsTable}} API retrieves {{WordVector}}s via {{CharSequence}}. This is suboptimal, because implementors may store the WVs in a hash table (e.g. {{MapWordVectorsTable}}) and the value of {{CharSequence#toString}} is not guaranteed to be stable.
Additionally, words are more commonly represented as Strings than as CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).
So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.
> WordVectorTable should reference WVs by String
> ----------------------------------------------
>
> Key: OPENNLP-1169
> URL: https://issues.apache.org/jira/browse/OPENNLP-1169
> Project: OpenNLP
> Issue Type: Bug
> Components: word vectors
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: 1.8.4
>
>
> The {{WordVectorsTable}} API retrieves a {{WordVector}} via a {{CharSequence}}. This is suboptimal, because implementors may store the WVs in a hash table (e.g. {{MapWordVectorsTable}}) and the value of {{CharSequence#toString}} is not guaranteed to be stable.
> Additionally, words are more commonly represented as Strings than as CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).
> So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.
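
The concern in the description can be illustrated with a minimal sketch. It does not use the OpenNLP classes: the class name WordVectorLookupSketch, its methods and the float[] values are hypothetical stand-ins, and the map only imitates how a hash-table-backed implementation might store vectors. The sketch relies on the fact that the CharSequence interface does not refine the equals and hashCode contracts, so two CharSequences with the same characters need not be equal as map keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the OpenNLP WordVectorsTable API.
public class WordVectorLookupSketch {

    // Lookup keyed by CharSequence: the stored key is a String, the query is a
    // StringBuilder with the same characters. StringBuilder does not override
    // equals/hashCode, so the hash lookup misses.
    static boolean charSequenceKeyFound() {
        Map<CharSequence, float[]> table = new HashMap<>();
        table.put("cat", new float[] {0.1f, 0.2f});
        CharSequence query = new StringBuilder("cat");
        return table.containsKey(query);
    }

    // Lookup keyed by String: the caller converts once, and String's
    // value-based equals/hashCode make the lookup succeed.
    static boolean stringKeyFound() {
        Map<String, float[]> table = new HashMap<>();
        table.put("cat", new float[] {0.1f, 0.2f});
        CharSequence query = new StringBuilder("cat");
        return table.containsKey(query.toString());
    }

    public static void main(String[] args) {
        System.out.println("CharSequence key found: " + charSequenceKeyFound());
        System.out.println("String key found: " + stringKeyFound());
    }
}
```

With a String parameter the conversion happens once, at the call site, and the table can use the word directly as a hash key.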
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)