Posted to dev@opennlp.apache.org by Anthony Beylerian <an...@gmail.com> on 2015/09/01 07:05:56 UTC

GSoC - WSD component

Hello,

We have received the results concerning this year's GSoC.
I am glad to say that we have passed the final evaluation!
I would really like to thank Jörn and Rodrigo for their support during the program.
We have enjoyed the challenges and hope to contribute in the future.

Concerning the next steps, we are currently working on packaging
what is already available.
This mostly means improving the CLI support and the unit tests.

Beyond that, there is an interesting approach to improve disambiguation
accuracy using domain knowledge [1].

In that work, OntoNotes and SemCor were used to obtain close to 90%
accuracy on coarse-grained senses.

Moreover, with the so-called "Augment" technique, it is possible to combine
domain-specific information with training on general-domain
information.
This is useful, e.g., for the medical field (related to cTAKES), and is
expected to give better performance when domain knowledge is available.

I believe it would be possible to add support for this approach later on
since it only involves augmenting the feature space.
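
To illustrate what augmenting the feature space could look like, here is a
minimal sketch. It assumes the common feature-augmentation scheme for domain
adaptation, where every feature is emitted twice: once as a shared copy and
once as a copy tagged with the domain of the training example. The class name,
method name and prefixes are hypothetical and are not part of the OpenNLP API
or of the current WSD component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FeatureAugmenter {

    // Hypothetical prefix for the copy shared by all domains.
    static final String SHARED_PREFIX = "shared:";

    /**
     * Returns each input feature twice: a shared copy and a copy
     * prefixed with the domain label of the example.
     */
    public static List<String> augment(List<String> features, String domain) {
        List<String> augmented = new ArrayList<>(features.size() * 2);
        for (String feature : features) {
            augmented.add(SHARED_PREFIX + feature);   // seen by every domain
            augmented.add(domain + ":" + feature);    // seen by this domain only
        }
        return augmented;
    }

    public static void main(String[] args) {
        // Illustrative context features for one ambiguous word.
        List<String> context = Arrays.asList("w-1=the", "w+1=account");
        System.out.println(augment(context, "general"));
        System.out.println(augment(context, "medical"));
    }
}
```

Under this assumption the classifier itself would not need to change: the
shared copies let it learn what holds across domains, while the tagged copies
let it learn domain-specific preferences.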

Regards,

Anthony

[1] : http://www.aclweb.org/anthology/D08-1105