Posted to dev@jena.apache.org by "Osma Suominen (JIRA)" <ji...@apache.org> on 2016/05/02 09:34:12 UTC

[jira] [Updated] (JENA-1172) blank nodes can break jena-text

     [ https://issues.apache.org/jira/browse/JENA-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Osma Suominen updated JENA-1172:
--------------------------------
    Affects Version/s:     (was: Fuseki 2.3.1)
                       Jena 3.0.1

> blank nodes can break jena-text
> -------------------------------
>
>                 Key: JENA-1172
>                 URL: https://issues.apache.org/jira/browse/JENA-1172
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Text
>    Affects Versions: Jena 3.0.1
>            Reporter: Osma Suominen
>            Assignee: Osma Suominen
>
> Data with blank node subjects can break the jena-text index.
> For this example I use a typical jena-text configuration which indexes rdfs:label. Then I add this triple:
> {noformat}
> _:b0 <http://www.w3.org/2000/01/rdf-schema#label> "blank" .
> {noformat}
> There is no error (though I remember seeing WARNINGs in other situations like this) and the triple gets indexed.
> When I later execute this query:
> {noformat}
> PREFIX text: <http://jena.apache.org/text#>
> SELECT ?s { ?s text:query 'blank' }
> {noformat}
> I get this error:
> {noformat}
> 10:22:38 WARN  [5] RC = 500 : java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
> org.apache.jena.ext.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.UnsupportedOperationException: 3ed87b7f14f612ef53788d889f6410d6 is not a URI node
> 	at org.apache.jena.ext.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
> 	at org.apache.jena.ext.com.google.common.cache.LocalCache.get(LocalCache.java:3937)
> 	at org.apache.jena.ext.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
> 	at org.apache.jena.atlas.lib.cache.CacheGuava.getOrFill(CacheGuava.java:58)
> 	at org.apache.jena.query.text.TextQueryPF.query(TextQueryPF.java:291)
> 	at org.apache.jena.query.text.TextQueryPF.variableSubject(TextQueryPF.java:229)
> 	at org.apache.jena.query.text.TextQueryPF.exec(TextQueryPF.java:198)
> 	at org.apache.jena.sparql.pfunction.PropertyFunctionBase$RepeatApplyIteratorPF.nextStage(PropertyFunctionBase.java:106)
> {noformat}
> Note that this happens whenever a jena-text query matches a blank node subject, so a single triple with a blank node subject can "taint" the whole index. This is the case with LCSH, which contains a few hundred blank nodes that have a skos:prefLabel property (among almost 8M triples that otherwise use URIs for everything).
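
Since a single blank node subject with an indexed property is enough to trigger the error described above, it may help to check a dataset for such triples before loading it. The following is a minimal sketch, not part of Jena or of the original report: it assumes the data is available as N-Triples with one triple per line, uses a simplified regular expression rather than a full parser, and the function name and the example.org IRIs are hypothetical.

```python
import re

# Property indexed by the jena-text configuration in the report above.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Simplified pattern: a blank node label followed by a predicate IRI.
# This is an assumption-laden shortcut, not a complete N-Triples grammar.
_BLANK_SUBJECT = re.compile(r'^\s*(_:\S+)\s+<([^>]*)>\s')


def blank_subjects_with_indexed_property(lines, indexed_predicates):
    """Yield (line_number, blank_label, predicate) for every line whose
    subject is a blank node and whose predicate is in indexed_predicates."""
    for number, line in enumerate(lines, start=1):
        match = _BLANK_SUBJECT.match(line)
        if match and match.group(2) in indexed_predicates:
            yield number, match.group(1), match.group(2)


# Hypothetical sample data; the first line is the triple from the report.
sample = [
    '_:b0 <http://www.w3.org/2000/01/rdf-schema#label> "blank" .',
    '<http://example.org/a> <http://www.w3.org/2000/01/rdf-schema#label> "named" .',
    '_:b1 <http://example.org/other> "not indexed" .',
]

hits = list(blank_subjects_with_indexed_property(sample, {RDFS_LABEL}))
for number, label, predicate in hits:
    print(f"line {number}: blank node {label} has indexed property <{predicate}>")
```

For a dataset such as LCSH, the set of predicates would need to include skos:prefLabel (or whichever properties the text index is configured for) instead of rdfs:label.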


