You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/10/15 10:37:43 UTC

[jira] [Comment Edited] (STANBOL-1132) Add co-reference resolution and dependency tree support in the Stanbol NLP processing API

    [ https://issues.apache.org/jira/browse/STANBOL-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795000#comment-13795000 ] 

Rupert Westenthaler edited comment on STANBOL-1132 at 10/15/13 8:36 AM:
------------------------------------------------------------------------

Hi Cristian

Had a look at you patch and IMO it is very good quality. Because of this and because the patch only contains new features I decided to commit it directly to trunk. This should make further work more simple (no merges needed, you can use the default Stanbol Launcher for testing). However if you would prefer to continue in a branch just let me know and I will also apply the patch to the branch.

In the following a list of things I noticed while working through your patch:

1. Can you adapt the implementation of GrammaticalRelation to represent the hierarchy similar to the Pos enum. Using a static {} block to build up the transitive closure over parents instead of doing it in the #addParents() method. The getParents() method would then use the static  EnumMap<GrammaticalRelation,EnumSet<GrammaticalRelation>>(GrammaticalRelation.class) instead of a local HashMap. I would like to use EnumSet/-Map implementations as they provide much better performance in contrast to HashSet/-Map implementations.

2. Annotations should be immutable. So I would like to remove all add** set** ... methods from CorefTag, DependencyFeature and GramaticalReleationTag. In addition Getter methods should return immutable wrappers. The best would be to use Collections.unmodifiableSet(..) when setting fields in the Constructor. I added according comments to the CorefTag

3. IMO DependencyFeatures is obsolete. That's because the Annotated interface already supports multiple values and DependencyFeatures is just a collection of multiple DependencyRelations. So instead of having a DependencyFeatures class I would suggest to just add multiple DependencyRelations values. In detail I suggest to
    * rename DependencyRelation to DependencyRelationTag.
    * DependencyRelationTag extends Tag<DependencyRelationTag>
    * Change NlpAnnotations#DEPENDENCY_ANNOTATION to new Annotation<DependencyFeatures>(
			"stanbol.enhancer.nlp.dependency", DependencyRelationTag.class);
    * Simple add multiple DependencyRelationTag Values to a token. This has also the advantage the iterations over those are sorted by the confidence of those.
    * simple add a new DependencyRelationTag

Again thanks for the patch. It is great to getting Coref and dependendy tree support to the Stanbol NLP module!


was (Author: rwesten):
Hi Cristian

Had a look at you patch and IMO it is very good quality. Because of this and because the patch only contains new features I decided to commit it directly to trunk. This should make further work more simple (no merges needed, you can use the default Stanbol Launcher for testing). However if you would prefer to continue in a branch just let me know and I will also apply the patch to the branch.

In the following a list of things I noticed while working through your patch:

1 Can you adapt the implementation of GrammaticalRelation to represent the hierarchy similar to the Pos enum. Using a static {} block to build up the transitive closure over parents instead of doing it in the #addParents() method. The getParents() method would then use the static  EnumMap<GrammaticalRelation,EnumSet<GrammaticalRelation>>(GrammaticalRelation.class) instead of a local HashMap. I would like to use EnumSet/-Map implementations as they provide much better performance in contrast to HashSet/-Map implementations.

2. Annotations should be immutable. So I would like to remove all add** set** ... methods from CorefTag, DependencyFeature and GramaticalReleationTag. In addition Getter methods should return immutable wrappers. The best would be to use Collections.unmodifiableSet(..) when setting fields in the Constructor. I added according comments to the CorefTag

3. IMO DependencyFeatures is obsolete. That's because the Annotated interface already supports multiple values and DependencyFeatures is just a collection of multiple DependencyRelations. So instead of having a DependencyFeatures class I would suggest to just add multiple DependencyRelations values. In detail I suggest to
    * rename DependencyRelation to DependencyRelationTag.
    * DependencyRelationTag extends Tag<DependencyRelationTag>
    * Change NlpAnnotations#DEPENDENCY_ANNOTATION to new Annotation<DependencyFeatures>(
			"stanbol.enhancer.nlp.dependency", DependencyRelationTag.class);
    * Simple add multiple DependencyRelationTag Values to a token. This has also the advantage the iterations over those are sorted by the confidence of those.
    * simple add a new DependencyRelationTag

Again thanks for the patch. It is great to getting Coref and dependendy tree support to the Stanbol NLP module!

> Add co-reference resolution and dependency tree support in the Stanbol NLP processing API
> -----------------------------------------------------------------------------------------
>
>                 Key: STANBOL-1132
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1132
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancement Engines
>            Reporter: Cristian Petroaca
>            Assignee: Rupert Westenthaler
>              Labels: co-reference, dependency-tree, nlp
>         Attachments: coref_dependency_tree_datamodel.patch
>
>
> Extend the Stanbol NLP Processing API with annotations for co-reference resolution and dependency trees.
> Also, add support for JSON Serialisation/Parsing for the co-reference and dependency tree annotations so that the RESTful NLP Analysis Service can provide co-reference information.



--
This message was sent by Atlassian JIRA
(v6.1#6144)