Posted to dev@jena.apache.org by "Richard Cyganiak (Created) (JIRA)" <ji...@apache.org> on 2011/11/02 16:09:32 UTC

[jira] [Created] (JENA-153) Query bug for large integers

Query bug for large integers
----------------------------

                 Key: JENA-153
                 URL: https://issues.apache.org/jira/browse/JENA-153
             Project: Jena
          Issue Type: Bug
          Components: ARQ, Jena
            Reporter: Richard Cyganiak
            Priority: Minor


ARQ handles small xsd:integers fine, and it handles large xsd:integers fine, but there seems to be some weirdness going on with integers of ~20 digits:

ASK {FILTER (200000/2=100000)} => true
ASK {FILTER (20000000/2=10000000)} => true
ASK {FILTER (2000000000/2=1000000000)} => true
ASK {FILTER (200000000000/2=100000000000)} => true
ASK {FILTER (20000000000000/2=10000000000000)} => true
ASK {FILTER (2000000000000000/2=1000000000000000)} => true
ASK {FILTER (200000000000000000/2=100000000000000000)} => true
ASK {FILTER (20000000000000000000/2=10000000000000000000)} => ***false***
ASK {FILTER (2000000000000000000000/2=1000000000000000000000)} => true
ASK {FILTER (200000000000000000000000/2=100000000000000000000000)} => true
ASK {FILTER (20000000000000000000000000/2=10000000000000000000000000)} => true

These were all tested in http://sparql.org/sparql.html with an arbitrary target graph URI.

It works fine again if dividend and quotient are changed to xsd:decimal:

ASK {FILTER (20000000000000000000.0/2=10000000000000000000.0)} => true

I guess this may have something to do with Java's native long being used for xsd:integers of some size, and BigInteger for others?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (JENA-153) Query bug for large integers

Posted by "Damian Steer (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142226#comment-13142226 ] 

Damian Steer edited comment on JENA-153 at 11/2/11 4:00 PM:
------------------------------------------------------------

I think you're right about an overflow issue here:

select ((20000000000000000000/2) as ?answer) {} => "776627963145224192." ^^<http://www.w3.org/2001/XMLSchema#decimal>
select ((10000000000000000000) as ?answer) {} => "10000000000000000000" ^^<http://www.w3.org/2001/XMLSchema#integer>

Not sure what's going on there.

[edit] But as a guess the latter 'correct' answers are true because both sides are going wrong in the same way.
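One possible reading of those two results, as a minimal Java sketch. It assumes the integer is narrowed with BigInteger.longValue() before the division; that is an assumption for illustration, not ARQ's actual code. The class and method names are made up. ARQ returns the quotient as an xsd:decimal, whereas this sketch uses plain long division, but the digits come out the same.

```java
import java.math.BigInteger;

class OverflowDemo {
    // Keep only the low 64 bits, which is what BigInteger.longValue() does.
    static long narrow(String lexical) {
        return new BigInteger(lexical).longValue();
    }

    public static void main(String[] args) {
        // 20-digit case: the two sides wrap differently, so the comparison fails.
        long dividend = narrow("20000000000000000000");
        long quotient = narrow("10000000000000000000");
        System.out.println(dividend / 2);             // 776627963145224192
        System.out.println(quotient);                 // -8446744073709551616
        System.out.println(dividend / 2 == quotient); // false

        // 22-digit case: both sides wrap to values that still agree.
        long bigDividend = narrow("2000000000000000000000");
        long bigQuotient = narrow("1000000000000000000000");
        System.out.println(bigDividend / 2 == bigQuotient); // true
    }
}
```

If that assumption holds, the 'true' results for the longer literals are agreement between two wrapped values rather than correct arithmetic.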
                
      was (Author: shellac):
    I think you're right about an overflow issue here:

select ((20000000000000000000/2) as ?answer) {} => "776627963145224192." ^^<http://www.w3.org/2001/XMLSchema#decimal>
select ((10000000000000000000) as ?answer) {} => "10000000000000000000" ^^<http://www.w3.org/2001/XMLSchema#integer>

Not sure what's going on there.
                  

        

[jira] [Commented] (JENA-153) xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ

Posted by "Andy Seaborne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142369#comment-13142369 ] 

Andy Seaborne commented on JENA-153:
------------------------------------

Internally, ARQ uses BigInteger. The quoted /typedLiterals.html/ refers to Jena core; ARQ has its own, extended value processing because it adds a lot of machinery.

Knowing it's not one single operation or one query is very helpful in narrowing the places to look - thanks.

Found: NodeValue._setByValue squeezes value into a long for no good reason.

ARQ has a command line tool "arq.qexpr" that evaluates SPARQL expressions.

                

        

[jira] [Commented] (JENA-153) xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ

Posted by "Richard Cyganiak (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142572#comment-13142572 ] 

Richard Cyganiak commented on JENA-153:
---------------------------------------

Tested latest SVN and it works. Thanks! (3h 22min from report to fix, wow!)
                

        

[jira] [Closed] (JENA-153) xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ

Posted by "Andy Seaborne (Closed) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne closed JENA-153.
------------------------------


Comprehensive reports lead to prompt bug fixes.
                

        

[jira] [Updated] (JENA-153) xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ

Posted by "Richard Cyganiak (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Cyganiak updated JENA-153:
----------------------------------

    Summary: xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ  (was: Query bug for large integers)

I changed the bug's description to better reflect the problem.
                

        

[jira] [Commented] (JENA-153) Query bug for large integers

Posted by "Richard Cyganiak (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142306#comment-13142306 ] 

Richard Cyganiak commented on JENA-153:
---------------------------------------

I think you're right, Damian. Very large xsd:integer literals can be *returned* just fine:

SELECT (10000000000000000000) {}
    => "10000000000000000000" ^^<http://www.w3.org/2001/XMLSchema#integer>

But any *arithmetic*, including comparison, seems to be done on a limited-size representation of the integer:

SELECT (10000000000000000000+0) {}
    => "-8446744073709551616" ^^<http://www.w3.org/2001/XMLSchema#integer>
SELECT (10000000000000000000=-8446744073709551616) {}
    => true

Now I guess this just means that ARQ keeps the lexical form of literals that appear, uh, literally in the query, and uses it when returning the literal directly as a query result; for arithmetic, though, ARQ uses a different representation based on the datatype:

SELECT (007) {}
    => "007" ^^<http://www.w3.org/2001/XMLSchema#integer>
SELECT (007+0) {}
    => "7" ^^<http://www.w3.org/2001/XMLSchema#integer>
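The difference between a retained lexical form and a parsed value can be illustrated with a small Java sketch. The class and method names are hypothetical, and this is not ARQ's code; it only shows that once a literal has been parsed into a number, its original spelling is gone.

```java
import java.math.BigInteger;

class LexicalVsValue {
    // Parse the lexical form and print the value back: leading zeros are lost.
    static String afterArithmetic(String lexical) {
        return new BigInteger(lexical).add(BigInteger.ZERO).toString();
    }

    public static void main(String[] args) {
        String lexical = "007";
        System.out.println(lexical);                  // 007 (the literal as written)
        System.out.println(afterArithmetic(lexical)); // 7   (the value after "+0")
    }
}
```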

And this different representation appears to be java.lang.Long:

SELECT (9223372036854775807+0) {}
    => "9223372036854775807" ^^<http://www.w3.org/2001/XMLSchema#integer>
SELECT (9223372036854775808+0) {}
    => "-9223372036854775808" ^^<http://www.w3.org/2001/XMLSchema#integer>

So any value larger than 9223372036854775807 will overflow.
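The same boundary can be reproduced in plain Java. This is a sketch that assumes the narrowing happens through BigInteger.longValue(), which keeps the low 64 bits; the helper names are made up for illustration.

```java
import java.math.BigInteger;

class LongBoundary {
    // Mimic the "+0" queries above after narrowing the literal to a long.
    static long plusZeroAsLong(String lexical) {
        return new BigInteger(lexical).longValue() + 0L;
    }

    // The same operation carried out without narrowing.
    static BigInteger plusZeroExact(String lexical) {
        return new BigInteger(lexical).add(BigInteger.ZERO);
    }

    public static void main(String[] args) {
        System.out.println(Long.MAX_VALUE);                         // 9223372036854775807
        System.out.println(plusZeroAsLong("9223372036854775807"));  // 9223372036854775807
        System.out.println(plusZeroAsLong("9223372036854775808"));  // -9223372036854775808
        System.out.println(plusZeroAsLong("10000000000000000000")); // -8446744073709551616
        System.out.println(plusZeroExact("9223372036854775808"));   // 9223372036854775808
    }
}
```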

XSD only requires a minimally conforming processor to support integers up to ±10^16, so using only long internally is OK. But then it also says:

[[
When the datatype validity of a value or literal is uncertain because it exceeds the capacity of a partial implementation, the literal or value must not be treated as invalid, and the unsupported value must not be quietly changed to a supported value.
]]
http://www.w3.org/TR/xmlschema11-2/#partial-implementation

So that seems to be a clear bug – it should either do the correct arithmetic, or raise an exception at *some* point.
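Both options are available in standard Java, sketched below. The helper names are hypothetical and nothing here is taken from ARQ. BigInteger.longValueExact() (Java 8 and later) throws ArithmeticException instead of wrapping, and staying in BigInteger gives the correct arithmetic.

```java
import java.math.BigInteger;

class ExactNarrowing {
    // Returns the value as a long, or throws ArithmeticException if it does not fit.
    static long toLongOrFail(BigInteger value) {
        return value.longValueExact();
    }

    // A value fits in a signed 64-bit long when it needs at most 63 bits plus sign.
    static boolean fitsInLong(BigInteger value) {
        return value.bitLength() <= 63;
    }

    public static void main(String[] args) {
        BigInteger big = new BigInteger("10000000000000000000");
        System.out.println(fitsInLong(big)); // false
        try {
            toLongOrFail(big);
        } catch (ArithmeticException e) {
            System.out.println("refused to narrow: " + e.getMessage());
        }
        // Arbitrary-precision division gives the expected quotient instead.
        BigInteger half = new BigInteger("20000000000000000000").divide(BigInteger.valueOf(2));
        System.out.println(half.equals(big)); // true
    }
}
```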

I'm surprised to find that arithmetic doesn't use Java's BigIntegers. I found this in the docs:

[[
When parsing an xsd:integer the Java value object used will be an Integer, Long or BigInteger depending on the size of the specific value being represented.
]]
http://openjena.org/how-to/typedLiterals.html
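What "depending on the size" could look like, as a rough Java sketch. This is a hypothetical helper written for illustration, not the Jena core implementation; it picks the narrowest of Integer, Long and BigInteger that holds the parsed value.

```java
import java.math.BigInteger;

class SizedIntegerParse {
    // Hypothetical helper, not Jena's code: choose the narrowest type that fits.
    static Number parse(String lexical) {
        BigInteger value = new BigInteger(lexical);
        if (value.bitLength() <= 31) {
            return value.intValue();   // boxed as Integer
        }
        if (value.bitLength() <= 63) {
            return value.longValue();  // boxed as Long
        }
        return value;                  // too large for long: keep the BigInteger
    }

    public static void main(String[] args) {
        System.out.println(parse("7").getClass().getSimpleName());                    // Integer
        System.out.println(parse("2147483648").getClass().getSimpleName());           // Long
        System.out.println(parse("10000000000000000000").getClass().getSimpleName()); // BigInteger
    }
}
```

With a scheme like this, the overflow only appears if a later step narrows the BigInteger case back to a long.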
                

       

[jira] [Commented] (JENA-153) Query bug for large integers

Posted by "Damian Steer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13142226#comment-13142226 ] 

Damian Steer commented on JENA-153:
-----------------------------------

I think you're right about an overflow issue here:

select ((20000000000000000000/2) as ?answer) {} => "776627963145224192." ^^<http://www.w3.org/2001/XMLSchema#decimal>
select ((10000000000000000000) as ?answer) {} => "10000000000000000000" ^^<http://www.w3.org/2001/XMLSchema#integer>

Not sure what's going on there.
                

        

[jira] [Resolved] (JENA-153) xsd:integers larger than java.long.MAX_VALUE silently overflow in ARQ

Posted by "Andy Seaborne (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/JENA-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-153.
--------------------------------

    Resolution: Fixed
      Assignee: Andy Seaborne

Fix applied to SVN, including additional tests.

Please let this JIRA know if it works for you.
                