Posted to commits@stanbol.apache.org by "Florent ANDRE (JIRA)" <ji...@apache.org> on 2011/02/14 12:59:57 UTC

[jira] Created: (STANBOL-85) Duplicate class EnhancementRDFUtils inside engines.entityTagging & engines.autotagging

Duplicate class EnhancementRDFUtils inside engines.entityTagging & engines.autotagging
--------------------------------------------------------------------------------------

                 Key: STANBOL-85
                 URL: https://issues.apache.org/jira/browse/STANBOL-85
             Project: Stanbol
          Issue Type: Bug
          Components: Enhancer
            Reporter: Florent ANDRE


$ diff entitytagging/src/main/java/org/apache/stanbol/enhancer/engines/entitytagging/impl/EnhancementRDFUtils.java autotagging/src/main/java/org/apache/stanbol/enhancer/engines/autotagging/impl/EnhancementRDFUtils.java 

17c17
< package org.apache.stanbol.enhancer.engines.entitytagging.impl;
---
> package org.apache.stanbol.enhancer.engines.autotagging.impl;
20d19
< import java.util.Iterator;
22,23d20
< import org.apache.clerezza.rdf.core.Language;
< import org.apache.clerezza.rdf.core.Literal;
28d24
< import org.apache.clerezza.rdf.core.impl.PlainLiteralImpl;
29a26
> import org.apache.stanbol.autotagging.TagInfo;
32,35d28
< import org.apache.stanbol.entityhub.servicesapi.model.Reference;
< import org.apache.stanbol.entityhub.servicesapi.model.Sign;
< import org.apache.stanbol.entityhub.servicesapi.model.Text;
< import org.apache.stanbol.entityhub.servicesapi.model.rdf.RdfResourceEnum;
40,46d32
< /**
<  * Utility taken form the engine.autotagging bundle and adapted from
<  * using TagInfo to {@link Sign}.
<  *
<  * @author Rupert Westenthaler
<  * @author ogrisel (original utility)
<  */
48a35,37
>     private EnhancementRDFUtils() {
>     }
> 
54c43
<      * @param entity the related entity
---
>      * @param tag the related entity
57,81c46
<             MGraph graph, UriRef contentItemId, Collection<NonLiteral> relatedEnhancements, Sign entity) {
<         //1. check if the returned Entity does has a label -> if not return null
<         //add labels (set only a single label. Use "en" if available!
<         Text label = null;
<         Iterator<Text> labels = entity.getRepresentation().getText(RDFS_LABEL.getUnicodeString());
<         while (labels.hasNext()) {
<             Text actLabel  = labels.next();
<             if(label == null){
<                 label = actLabel;
<             } else {
<                 if("en".equals(actLabel.getLanguage())){
<                     label = actLabel;
<                 }
<             }
<         }
<         if (label == null){
<             return null;
<         }
<         Literal literal;
<         if (label.getLanguage() == null){
<             literal = new PlainLiteralImpl(label.getText());
<         } else {
<             literal = new PlainLiteralImpl(label.getText(), new Language(label.getLanguage()));
<         }
<         //Now create the entityAnnotation
---
>             MGraph graph, UriRef contentItemId, Collection<NonLiteral> relatedEnhancements, TagInfo tag) {
86c51,52
<             graph.add(new TripleImpl(entityAnnotation, DC_RELATION, enhancement));
---
>             graph.add(new TripleImpl(entityAnnotation,
>                         DC_RELATION, enhancement));
88c54
<         UriRef entityUri = new UriRef(entity.getId());
---
>         UriRef entityUri = new UriRef(tag.getId());
90,104c56,60
<         graph.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_REFERENCE, entityUri));
<         //add the label parsed above
<         graph.add(new TripleImpl(entityAnnotation, ENHANCER_ENTITY_LABEL, literal));
<         //TODO: add real confidence values!
<         // -> in case of SolrYards this will be a Lucene score and not within the range [0..1]
<         // -> in case of SPARQL there will be no score information at all.
<         Object score = entity.getRepresentation().getFirst(RdfResourceEnum.resultScore.getUri());
<         Double scoreValue = new Double(-1); //use -1 if no score is available!
<         if (score != null){
<             try {
<                 scoreValue = Double.valueOf(score.toString());
<             } catch (NumberFormatException e) {
<                 //ignore
<             }
<         }
---
>         graph.add(new TripleImpl(entityAnnotation,
>                 ENHANCER_ENTITY_REFERENCE, entityUri));
>         graph.add(new TripleImpl(entityAnnotation,
>                 ENHANCER_ENTITY_LABEL,
>                 literalFactory.createTypedLiteral(tag.getLabel())));
107,110c63,64
<                 literalFactory.createTypedLiteral(scoreValue)));
< 
<         Iterator<Reference> types = entity.getRepresentation().getReferences(RDF_TYPE.getUnicodeString());
<         while (types.hasNext()) {
---
>                 literalFactory.createTypedLiteral(tag.getConfidence())));
>         for (String entityType : tag.getType()) {
112c66
<                     ENHANCER_ENTITY_TYPE, new UriRef(types.next().getReference())));
---
>                     ENHANCER_ENTITY_TYPE, new UriRef(entityType)));
114,118d67
<         //TODO: for now add the information about this entity to the graph
<         // -> this might be replaced by some additional engine at the end
< //        RdfValueFactory rdfValueFactory = RdfValueFactory.getInstance();
< //        RdfRepresentation representation = rdfValueFactory.toRdfRepresentation(entity.getRepresentation());
< //        graph.addAll(representation.getRdfGraph());
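
Read as prose, the diff above shows two rules that exist only in the entitytagging variant: pick a single label (the first one found, overridden by any label whose language is "en"), and fall back to a score of -1 when no parsable score is available. The following is a minimal sketch of just those two rules in plain Java, with no Stanbol or Clerezza types. The names `LabelAndScoreSketch`, `Label`, `pickLabel` and `parseScore` are hypothetical and do not exist in either bundle.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class LabelAndScoreSketch {

    /** Hypothetical stand-in for the entityhub Text type: a text plus an optional language. */
    static final class Label {
        final String text;
        final String language; // may be null when the label has no language tag

        Label(String text, String language) {
            this.text = text;
            this.language = language;
        }
    }

    /**
     * Same selection rule as the loop in the diff: keep the first label seen,
     * but let any later label tagged "en" replace it. Returns null if there
     * are no labels at all.
     */
    static Label pickLabel(Iterator<Label> labels) {
        Label label = null;
        while (labels.hasNext()) {
            Label current = labels.next();
            if (label == null || "en".equals(current.language)) {
                label = current;
            }
        }
        return label;
    }

    /** Same fallback as in the diff: -1 when the score is missing or not a number. */
    static double parseScore(Object score) {
        if (score == null) {
            return -1d;
        }
        try {
            return Double.parseDouble(score.toString());
        } catch (NumberFormatException e) {
            return -1d;
        }
    }

    public static void main(String[] args) {
        List<Label> labels = Arrays.asList(
                new Label("Paris", "fr"),
                new Label("Paris, France", "en"));
        System.out.println(pickLabel(labels.iterator()).text);
        System.out.println(parseScore("not-a-number"));
    }
}
```

As the TODO in the diff notes, a score coming from a SolrYard is a Lucene score and need not lie in [0..1], so the value returned by `parseScore` should not be read as a normalised confidence.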


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (STANBOL-85) Duplicate class EnhancementRDFUtils inside engines.entityTagging & engines.autotagging

Posted by "Rupert Westenthaler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/STANBOL-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181343#comment-13181343 ] 

Rupert Westenthaler commented on STANBOL-85:
--------------------------------------------

Currently this utility is used in three places:

* autotagger: this component is no longer used; it was replaced by the NER and NamedEntityTaggingEngine
* NamedEntityTaggingEngine
* TaxonomyLinkingEngine: this engine is no longer used; it was replaced by the KeywordLinkingEngine [1]

Still, it would be nice to move the utilities that write an EntityEnhancement from Entity data to a common place. However, this can be done as part of introducing the new Stanbol Enhancement Structure (STANBOL-351).

[1] See the end of http://incubator.apache.org/stanbol/docs/trunk/enhancer/engines/keywordlinkingengine.html for more information.
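
To illustrate what a common place for these utilities could look like, here is a rough sketch under stated assumptions, not a proposal for the actual Stanbol API. It assumes a hypothetical `EntityData` interface that both a `TagInfo` and a `Sign` could be adapted to, and it uses plain strings in place of Clerezza resources so that it runs on its own. The predicate strings reuse the constant names visible in the diff, except `CONFIDENCE_PLACEHOLDER`, which stands in for a predicate the diff does not show.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class SharedEnhancementSketch {

    /** Hypothetical common view of an entity; neither bundle defines this interface. */
    interface EntityData {
        String getId();
        String getLabel();
        double getConfidence();
        Collection<String> getTypes();
    }

    /** Simplified triple: plain strings stand in for Clerezza resources and literals. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return subject + " " + predicate + " " + object;
        }
    }

    /**
     * Writes the statements both variants in the diff have in common:
     * entity reference, label, confidence and one statement per type.
     */
    static List<Triple> writeEntityAnnotation(String annotation, EntityData entity) {
        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple(annotation, "ENHANCER_ENTITY_REFERENCE", entity.getId()));
        triples.add(new Triple(annotation, "ENHANCER_ENTITY_LABEL", entity.getLabel()));
        // The diff does not show the confidence predicate, so this name is a placeholder.
        triples.add(new Triple(annotation, "CONFIDENCE_PLACEHOLDER",
                String.valueOf(entity.getConfidence())));
        for (String type : entity.getTypes()) {
            triples.add(new Triple(annotation, "ENHANCER_ENTITY_TYPE", type));
        }
        return triples;
    }

    public static void main(String[] args) {
        EntityData paris = new EntityData() {
            public String getId() { return "urn:example:paris"; }
            public String getLabel() { return "Paris"; }
            public double getConfidence() { return 0.9; }
            public Collection<String> getTypes() {
                return Arrays.asList("urn:example:Place");
            }
        };
        for (Triple t : writeEntityAnnotation("urn:example:annotation-1", paris)) {
            System.out.println(t);
        }
    }
}
```

With something along these lines, each engine would only need a small adapter from its own entity type to `EntityData`, and the statement-writing code would exist once.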
                

        

[jira] [Resolved] (STANBOL-85) Duplicate class EnhancementRDFUtils inside engines.entityTagging & engines.autotagging

Posted by "Rupert Westenthaler (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/STANBOL-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-85.
----------------------------------------

    Resolution: Won't Fix

Marking this as Won't Fix for the reasons given in the comment above. Feel free to reopen if you do not agree that we can live with this duplication until we switch to the Stanbol Enhancement Structure.
                