Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/08/03 17:47:48 UTC

[jira] [Comment Edited] (STANBOL-1144) NullPointerException during enhancement due to invalid TextAnnotations produced by NamedEntityTaggingEngine

    [ https://issues.apache.org/jira/browse/STANBOL-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728568#comment-13728568 ] 

Rupert Westenthaler edited comment on STANBOL-1144 at 8/3/13 3:46 PM:
----------------------------------------------------------------------

Hi Tom, if nobody else takes a look at it, I will do so when I am back from vacation (12 August).
                
      was (Author: rwesten):
    HI Tom, I no other does have a look on it I will do it when I am back from vacation (12. August)
                  
> NullPointerException during enhancement due to invalid TextAnnotations produced by NamedEntityTaggingEngine
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: STANBOL-1144
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1144
>             Project: Stanbol
>          Issue Type: Bug
>          Components: Enhancement Engines
>    Affects Versions: enhancement-engines-0.10.0
>         Environment: Any
>            Reporter: Tom Lomax
>            Assignee: Rupert Westenthaler
>            Priority: Critical
>              Labels: easyfix, patch, security
>         Attachments: NamedEntityFix.diff
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Certain pieces of text, when sent to the enhancer, cause a crash in the Solr Yard code. For example, trying to enhance the following using the standard default chain:
> Syrian regime!"
> (That's "Syrian regime" followed by an exclamation point and a double quote).
> ...results in a NullPointerException in the dbpediaLinking phase:
> Caused by: java.lang.NullPointerException
>     at org.apache.stanbol.entityhub.yard.solr.impl.SolrQueryFactory.initTextConstraint(SolrQueryFactory.java:415)
>     at org.apache.stanbol.entityhub.yard.solr.impl.SolrQueryFactory.createIndexConstraint(SolrQueryFactory.java:330)
>     at org.apache.stanbol.entityhub.yard.solr.impl.SolrQueryFactory.parseFieldQuery(SolrQueryFactory.java:235)
>     at org.apache.stanbol.entityhub.yard.solr.impl.SolrYard.find(SolrYard.java:267)
>     at org.apache.stanbol.entityhub.yard.solr.impl.SolrYard.findRepresentation(SolrYard.java:362)
>     at org.apache.stanbol.entityhub.core.site.CacheImpl.findRepresentation(CacheImpl.java:192)
>     at org.apache.stanbol.entityhub.core.impl.ReferencedSiteImpl.findEntities(ReferencedSiteImpl.java:151)
>     at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEntityRecommentations(NamedEntityTaggingEngine.java:505)
>     at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEnhancements(NamedEntityTaggingEngine.java:370)
>     at org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.processEvent(EnhancementJobHandler.java:271)
>     at org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.handleEvent(EnhancementJobHandler.java:189)
>     at org.apache.felix.eventadmin.impl.tasks.HandlerTaskImpl.execute(HandlerTaskImpl.java:88)
>     at org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:221)
>     at org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:110)
>     at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Thread.java:724)
> I believe the cause is in enhancement-engines/entitytagging/src/main/java/org/apache/stanbol/enhancer/engines/entitytagging/impl/NamedEntity::createFromTextAnnotation, called by NamedEntityTaggingEngine::computeEnhancements. It verifies that the supplied entity name is not null or empty, but later removes punctuation and calls trim() again (the cleanupKeywords method), so an empty entity name can end up in the TextAnnotation being generated.
> This should be an easy fix, provided this is the correct place to fix it. I will attach a patch.
> I have reproduced this on clean installs of 0.10.0, and trunk as of this morning (r1509579).
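
For illustration only, the failure mode described above (validating the name before cleanup rather than after) can be sketched as follows. This is not the Stanbol code and not the attached patch: the class EntityNameSketch and the methods cleanup and usableName are hypothetical names, and the punctuation-stripping rule is an assumption standing in for whatever cleanupKeywords actually does.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names exist in Stanbol.
final class EntityNameSketch {

    // Assumed stand-in for the cleanup step: replace punctuation with
    // spaces, then trim. The real cleanupKeywords may behave differently.
    static String cleanup(String name) {
        return name.replaceAll("\\p{Punct}", " ").trim();
    }

    // Check for emptiness AFTER cleanup, so a name consisting only of
    // punctuation is rejected instead of being passed on as "".
    static Optional<String> usableName(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String cleaned = cleanup(raw);
        return cleaned.isEmpty() ? Optional.empty() : Optional.of(cleaned);
    }

    public static void main(String[] args) {
        // A punctuation-only name passes a pre-cleanup "not empty" check
        // but is empty once cleaned.
        System.out.println(usableName("!\"").isPresent());
        System.out.println(usableName("Syrian regime!\"").orElse("<none>"));
    }
}
```

A caller would skip the lookup whenever usableName returns an empty Optional, rather than building a query from an empty string.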

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira