You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@jena.apache.org by "Rob Vesse (Created) (JIRA)" <ji...@apache.org> on 2012/01/19 23:12:39 UTC

[jira] [Created] (JENA-199) Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries

Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries
-------------------------------------------------------------------------------------------------

                 Key: JENA-199
                 URL: https://issues.apache.org/jira/browse/JENA-199
             Project: Jena
          Issue Type: Bug
          Components: TDB
            Reporter: Rob Vesse
         Attachments: 5b.txt, 8.txt, sp2b10k.nt

This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.

Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.

The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:

PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX bench: <http://localhost/vocabulary/bench/>
PREFIX dc:    <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?person ?name
WHERE {
  ?article rdf:type bench:Article .
  ?article dc:creator ?person .
  ?inproc rdf:type bench:Inproceedings .
  ?inproc dc:creator ?person .
  ?person foaf:name ?name
}

And:

PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?name
WHERE {
  ?erdoes rdf:type foaf:Person .
  ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
  {
    ?document dc:creator ?erdoes .
    ?document dc:creator ?author .
    ?document2 dc:creator ?author .
    ?document2 dc:creator ?author2 .
    ?author2 foaf:name ?name
    FILTER (?author!=?erdoes &&
            ?document2!=?document &&
            ?author2!=?erdoes &&
            ?author2!=?author)
  } UNION {
    ?document dc:creator ?erdoes.
    ?document dc:creator ?author.
    ?author foaf:name ?name
    FILTER (?author!=?erdoes)
  }
}

I will attach these as files as well for convenience.

If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.

I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198899#comment-13198899 ] 

Andy Seaborne commented on JENA-199:
------------------------------------

Is it OK to close this as "can't reproduce"?  It can be reopened if new evidence comes to light.
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191536#comment-13191536 ] 

Rob Vesse commented on JENA-199:
--------------------------------

Yes I believe I did copy an existing dataset created with Fuseki 0.2.0 with TDB 0.8.0 so db corruption may be the root cause of this

Will try and see if I can reproduce on latest snapshots at some point in the next day or so
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198994#comment-13198994 ] 

Rob Vesse commented on JENA-199:
--------------------------------

Sorry for taking so long to reply on this, with an updated Fuseki/TDB snapshot and with a cleanly built database I can no longer reproduce this so I'll close as Not a Bug or similar
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries

Posted by "Rob Vesse (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189885#comment-13189885 ] 

Rob Vesse commented on JENA-199:
--------------------------------

This is the latest Fuseki 0.2.1 Snapshot, --version output from Fuseki:

Jena: VERSION: 2.7.0-incubating
Jena: BUILD_DATE: 2011-12-14T14:54:09+0000
ARQ: VERSION: 2.9.0-incubating
ARQ: BUILD_DATE: 2011-12-14T15:04:27+0000
TDB: VERSION: 0.9.0-incubating-SNAPSHOT
TDB: BUILD_DATE: 2012-01-12T01:15:34+0000
Fuseki: VERSION: 0.2.1-incubating-SNAPSHOT
Fuseki: BUILD_DATE: 2012-01-12T20:02:39+0000

When I say the output is blank it's blank totally, so it fails to even show a header with the variable names for all of the TSV, CSV and Text outputs which is what I'd expect if there are just empty rows.

There are answers to the queries because the XML and JSON outputs show valid output with multiple non-empty result rows.
                
> Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries
> -------------------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Closed) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Vesse closed JENA-199.
--------------------------

    Resolution: Not A Problem

Bug was due to using an old corrupt database created with an old Fuseki/TDB snapshot - using an up to data Fuseki/TDB snapshot with a cleanly built database does not exhibit the problem
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries

Posted by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189832#comment-13189832 ] 

Andy Seaborne commented on JENA-199:
------------------------------------

I can't think of anything TDB specific (but it's been a while since I looked at that code).

There is a lot of change going on in TDB, and hence Fuseki; exactly which version is this?  Some of the bug fixes around December/(v early January?) might account for it if "blank output" means there is a slot in TSV, it's just empty.
                
> Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries
> -------------------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190496#comment-13190496 ] 

Andy Seaborne commented on JENA-199:
------------------------------------

Using the current development TDB with ARQ from Jena 2.7.0, I can run the code either way round.  I get the same number of lines of output and no exceptions.

Also, I ran it with ARQ with the proposed patch for JENA-200 / TSV reading.

Looking at the code for BindingBase.hashCode, node is null.  This is a symptom of old bugs in TDB.  I thought they would have been fixed by the build of 12-Jan but they were DB-corrupting so if the data is older, then it might have already been broken.

Exact code I'm running:
https://svn.apache.org/repos/asf/incubator/jena/Scratch/AFS/Jena-Dev/trunk/src/dev/Jena199_BadBindingTDB.java

See script "dwim" for how I'm running the code.

Rob - if you have a moment, could you try with latest TDB, and a cleanly built database?  Thanks.

                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Vesse updated JENA-199:
---------------------------

    Attachment: TDBEmptyOutput.java

Attached code which demonstrates the issue - allows you to run with the data as a TDB dataset or as an ARQ in-memory dataset.

With TDB dataset I see a NPE in BindingBase.hashCode() whereas with ARQ I do not - the implication is the QueryIterator that TDB returns can return malformed Binding instances which have/return nulls for some variables OR whose variables list is incorrect so BindingBase tries to access variables that don't exist in the Binding
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries

Posted by "Rob Vesse (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189452#comment-13189452 ] 

Rob Vesse commented on JENA-199:
--------------------------------

Ok so after some playing I have reduced 5b down to a simpler query that exhibits this behavior:

PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX dc:    <http://purl.org/dc/elements/1.1/>

SELECT DISTINCT ?person ?name
WHERE {
  ?article dc:creator ?person .
  ?person foaf:name ?name .
}

Interestingly as soon as DISTINCT is removed then the Text, CSV and TSV output works fine.

So I'm still uncertain whether this is a TDB or ARQ issue, problem is that this reduced query works fine with an ARQ in-memory dataset so I'm still leaning towards it being caused by some interaction of TDB and the DISTINCT operator

Yet when I look at OpExecutorTDB I see:

    @Override
    protected QueryIterator execute(OpDistinct opDistinct, QueryIterator input)
    {
        return super.execute(opDistinct, input) ;
    }

Which leaves me none the wiser, following the code I end up in QueryIterDistinct but I assume an ARQ in-memory dataset would also use QueryIterDistinct so why should the query not return properly for some formats on TDB and yet return fine on the same formats with an ARQ in-memory dataset?
                
> Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries
> -------------------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-199) Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries

Posted by "Rob Vesse (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Vesse updated JENA-199:
---------------------------

    Attachment: 8.txt
                5b.txt
                sp2b10k.nt

SP2B 10k dataset and queries 5b and 8 which exhibit this problem
                
> Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries
> -------------------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Issue Comment Edited) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190011#comment-13190011 ] 

Rob Vesse edited comment on JENA-199 at 1/20/12 9:57 PM:
---------------------------------------------------------

So I got into this a lot more and was able to reproduce the issue in code and isolate it from Fuseki

It appears that something in this data causes TDB to return a null for one of the variables which is bizarre because there is no optional variables and it shouldn't be possible to set a null on a Binding AFAIK.

Stack Trace from the code I will shortly attach is as follows:

null
java.lang.NullPointerException
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
	at java.util.HashMap.put(HashMap.java:372)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
	at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
	at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
	at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
	at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
	at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)

So the lack of Text, CSV and TSV output appears to be that it hits this error partway through and all those writers either do a single flush at the end of output or in the case of the Text output have to do a complete iteration over the ResultSet before they can output or they reach whatever Jetty's flush threshold is.  Hence the lack of outputs for these formats yet output in the XML/JSON cases.

What I failed to notice previously was that the XML/JSON output was incomplete (though strangely it was valid) but it doesn't include as many results as it should.  Something in the Fuseki layer appears to catch and clean up the incomplete output whereas when evaluated via the code you get larger but incomplete output.
                
      was (Author: rvesse):
    So I got into this a lot more and was able to reproduce the issue in code and isolate it from Fuseki

It appears that something in this data causes TDB to return a null for one of the variables which is bizarre because there is no optional variables and it shouldn't be possible to set a null on a Binding AFAIK.

Stack Trace from the code I will shortly attach is as follows:

null
java.lang.NullPointerException
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
	at java.util.HashMap.put(HashMap.java:372)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
	at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
	at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
	at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
	at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
	at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)

So the lack of Text, CSV and TSV output appears to be that it hits this error partway through and all those writers either do a single flush at the end of output or in the case of the Text output have to do a complete iteration over the ResultSet before they can output.  Hence the lack of outputs and output in the XML/JSON cases, what I failed to notice previously was that the XML/JSON output was incomplete (it was valid) but it doesn't include as many results as it should.

Something in the Fuseki layer appears to catch and clean up the incomplete input whereas when evaluated via the code you get larger but incomplete output.
                  
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, TDBEmptyOutput.java, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Vesse updated JENA-199:
---------------------------

    Summary: BindingBase can hit a null pointer exception on certain queries against a TDB dataset  (was: Result Sets returned from TDB give empty output for Text, CSV and TSV formats for certain queries)

Edited title to reflect the actual cause of the issue
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset

Posted by "Rob Vesse (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190011#comment-13190011 ] 

Rob Vesse commented on JENA-199:
--------------------------------

So I got into this a lot more and was able to reproduce the issue in code and isolate it from Fuseki

It appears that something in this data causes TDB to return a null for one of the variables which is bizarre because there is no optional variables and it shouldn't be possible to set a null on a Binding AFAIK.

Stack Trace from the code I will shortly attach is as follows:

null
java.lang.NullPointerException
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
	at java.util.HashMap.put(HashMap.java:372)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
	at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
	at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
	at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
	at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
	at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)

So the lack of Text, CSV and TSV output appears to be that it hits this error partway through and all those writers either do a single flush at the end of output or in the case of the Text output have to do a complete iteration over the ResultSet before they can output.  Hence the lack of outputs and output in the XML/JSON cases, what I failed to notice previously was that the XML/JSON output was incomplete (it was valid) but it doesn't include as many results as it should.

Something in the Fuseki layer appears to catch and clean up the incomplete input whereas when evaluated via the code you get larger but incomplete output.
                
> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach so I apologize for that.
> Essentially the problem manifests as follows, when using a TDB dataset with Fuseki some queries will return blank output if the user requests Text, CSV or TSV.  When using XML/JSON the output is fine.
> The test data used is SP2B 10k, two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
> PREFIX bench: <http://localhost/vocabulary/bench/>
> PREFIX dc:    <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#> 
> PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX dc:   <http://purl.org/dc/elements/1.1/>
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries the Text, CSV and TSV output works fine.  This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters to either error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase so I haven't attempted to discover what the cause of the issue is though I may poke around anyway

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira