Posted to issues@opennlp.apache.org by "Tommaso Teofili (JIRA)" <ji...@apache.org> on 2017/12/16 09:52:00 UTC

[jira] [Updated] (OPENNLP-1169) WordVectorTable should reference WVs by String

     [ https://issues.apache.org/jira/browse/OPENNLP-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated OPENNLP-1169:
-------------------------------------
    Description: 
The {{WordVectorsTable}} API retrieves a {{WordVector}} via a {{CharSequence}}. This is suboptimal because implementors may store such WVs in a hash table (e.g. {{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not guaranteed to be stable.
Additionally, it is more common to have words as Strings rather than CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).

So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.

  was:
The {{WordVectorsTable}} API retrieves {{WordVector}}s via {{CharSequence}}. This is suboptimal because implementors may store such WVs in a hash table (e.g. {{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not guaranteed to be stable.
Additionally, it is more common to have words as Strings rather than CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).

So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.


> WordVectorTable should reference WVs by String
> ----------------------------------------------
>
>                 Key: OPENNLP-1169
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1169
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: word vectors
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.8.4
>
>
> The {{WordVectorsTable}} API retrieves a {{WordVector}} via a {{CharSequence}}. This is suboptimal because implementors may store such WVs in a hash table (e.g. {{MapWordVectorsTable}}), and the value of {{CharSequence#toString}} is not guaranteed to be stable.
> Additionally, it is more common to have words as Strings rather than CharSequences, which is also more consistent with other OpenNLP APIs (e.g. {{Tokenizer}}).
> So {{WordVectorsTable}} should instead retrieve {{WordVector}}s using String.
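
The hash-table concern in the description can be illustrated with a minimal sketch. It relies on a documented property of the JDK: {{CharSequence}} does not refine {{equals}} or {{hashCode}}, so two sequences with the same characters but different runtime types (here a {{String}} and a {{StringBuilder}}) are not interchangeable as map keys. The class and method names below are hypothetical and are not part of the OpenNLP API; plain {{HashMap}}s stand in for a map-backed table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; it does not mirror OpenNLP's actual classes.
class WordKeyLookupSketch {

    // Lookup in a table keyed by CharSequence, using the key object as given.
    // Whether it hits depends on the runtime type of the key.
    static float[] lookupByCharSequence(Map<CharSequence, float[]> table, CharSequence word) {
        return table.get(word);
    }

    // Lookup in a table keyed by String. String defines value-based
    // equals/hashCode, so equal text always finds the same entry.
    static float[] lookupByString(Map<String, float[]> table, String word) {
        return table.get(word);
    }

    public static void main(String[] args) {
        Map<CharSequence, float[]> csTable = new HashMap<>();
        csTable.put("cat", new float[] {0.1f, 0.2f});

        // Same characters, different CharSequence implementation.
        CharSequence builderKey = new StringBuilder("cat");

        // StringBuilder inherits identity-based equals/hashCode from Object,
        // so this lookup misses even though the text matches.
        System.out.println(lookupByCharSequence(csTable, builderKey) == null); // prints "true"

        Map<String, float[]> stringTable = new HashMap<>();
        stringTable.put("cat", new float[] {0.1f, 0.2f});

        // With a String-keyed table the caller converts once, explicitly.
        System.out.println(lookupByString(stringTable, builderKey.toString()) != null); // prints "true"
    }
}
```

With a String-based signature the conversion happens once, visibly, at the call site, rather than each implementation having to decide how to normalize an arbitrary {{CharSequence}}.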



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)