Posted to dev@clerezza.apache.org by "Rupert Westenthaler (Created) (JIRA)" <ji...@apache.org> on 2012/01/09 18:38:39 UTC

[jira] [Created] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer
-----------------------------------------------------------------------

                 Key: CLEREZZA-670
                 URL: https://issues.apache.org/jira/browse/CLEREZZA-670
             Project: Clerezza
          Issue Type: Bug
          Components: rdf.serialize
            Reporter: Rupert Westenthaler


The implementation of the SUBJECT_COMPARATOR within the RdfJsonSerializingProvider sometimes encounters an int MAX/MIN value overflow.
In such cases the compare method erroneously returns a

 * negative value - on a max value overflow, or a
 * positive value - on a min value overflow,

which causes the Triple array used for the serialization to be sorted incorrectly. In such cases subjects appear multiple times within the generated JSON output.

To solve this, one needs to replace the subtraction of "hashA" from "hashB" with a boolean check that returns -1/+1.
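
The overflow described above can be reproduced in isolation. The following is a minimal sketch, not the Clerezza code: the class and method names are hypothetical, and the operand order of the subtraction is chosen only for illustration.

```java
// Hypothetical demo class; it only illustrates the arithmetic, not the
// actual SUBJECT_COMPARATOR implementation.
class HashCompareDemo {

    // Overflow-prone: the difference wraps around when it does not fit
    // into an int, so the sign of the result can be wrong.
    static int compareBySubtraction(int hashA, int hashB) {
        return hashA - hashB;
    }

    // Overflow-safe: only compares the two values, never subtracts them.
    static int compareByCheck(int hashA, int hashB) {
        if (hashA == hashB) {
            return 0;
        }
        return hashA < hashB ? -1 : 1;
    }

    public static void main(String[] args) {
        // MAX_VALUE is greater than -1, but the difference wraps to MIN_VALUE.
        System.out.println(compareBySubtraction(Integer.MAX_VALUE, -1)); // -2147483648
        System.out.println(compareByCheck(Integer.MAX_VALUE, -1));       // 1

        // MIN_VALUE is less than 1, but the difference wraps to MAX_VALUE.
        System.out.println(compareBySubtraction(Integer.MIN_VALUE, 1));  // 2147483647
        System.out.println(compareByCheck(Integer.MIN_VALUE, 1));        // -1
    }
}
```

Since Java 7, Integer.compare(hashA, hashB) provides the same overflow-safe ordering without a hand-written check.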


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Rupert Westenthaler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183189#comment-13183189 ] 

Rupert Westenthaler commented on CLEREZZA-670:
----------------------------------------------

Hi all, there seem to be additional issues. I will do some additional debugging and provide further updates.
                
> Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer
> -----------------------------------------------------------------------
>
>                 Key: CLEREZZA-670
>                 URL: https://issues.apache.org/jira/browse/CLEREZZA-670
>             Project: Clerezza
>          Issue Type: Bug
>          Components: rdf.serialize
>            Reporter: Rupert Westenthaler
>              Labels: rdf/json
>         Attachments: CLEREZZA-670_rdf.rdfjson.int_value_overflow_in_subject_comparator

        

[jira] [Commented] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Daniel Spicar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191204#comment-13191204 ] 

Daniel Spicar commented on CLEREZZA-670:
----------------------------------------

Hi Rupert,

Thank you again for the patch, and sorry for the delay. I checked it out and it seems to be good. However, you did not grant Apache the right for inclusion for the second patch. From the context I assume it's just an oversight. I am not an expert on this legal stuff, but to be cautious I won't resolve this issue until you grant us the right for inclusion. So please drop a line here.
                

        

[jira] [Updated] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Rupert Westenthaler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler updated CLEREZZA-670:
-----------------------------------------

    Attachment: CLEREZZA-670_rdf.rdfjson.int_value_overflow_in_subject_comparator

This patch replaces the subtraction of the two hashes with a boolean check and therefore avoids integer range overflows.

In addition, it enables the unit test for the big RDF graph and adds an assertion that checks that the number of serialized triples equals the number of parsed triples. This check was very useful for finding this bug.
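
Why a triple count check can expose a mis-sorted array is sketched below. This is an illustration under assumptions, not the Clerezza serializer or its test: it assumes the serializer emits one entry per run of adjacent equal subjects and that the reader keeps only the last entry for a repeated subject key. All names are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; none of these names come from Clerezza.
class RoundTripCountDemo {

    // Emits one (subject, objects) entry per run of adjacent equal subjects.
    // This is only correct if the input is sorted by subject.
    static List<Map.Entry<String, List<String>>> serialize(String[][] triples) {
        List<Map.Entry<String, List<String>>> entries = new ArrayList<>();
        String current = null;
        List<String> objects = null;
        for (String[] triple : triples) {
            if (!triple[0].equals(current)) {
                current = triple[0];
                objects = new ArrayList<>();
                entries.add(new AbstractMap.SimpleEntry<>(current, objects));
            }
            objects.add(triple[1]);
        }
        return entries;
    }

    // Assumed reader behaviour: a repeated subject key overwrites the
    // earlier entry, so the triples of the earlier entry are lost.
    static int parseAndCount(List<Map.Entry<String, List<String>>> entries) {
        Map<String, List<String>> bySubject = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : entries) {
            bySubject.put(entry.getKey(), entry.getValue());
        }
        int count = 0;
        for (List<String> objects : bySubject.values()) {
            count += objects.size();
        }
        return count;
    }

    public static void main(String[] args) {
        String[][] sorted = {{"s1", "a"}, {"s1", "b"}, {"s2", "c"}};
        String[][] misSorted = {{"s1", "a"}, {"s2", "c"}, {"s1", "b"}};
        System.out.println(parseAndCount(serialize(sorted)));    // 3
        System.out.println(parseAndCount(serialize(misSorted))); // 2
    }
}
```

With the mis-sorted input the subject "s1" is emitted twice, and the round trip returns fewer triples than were put in, which is the kind of mismatch a count assertion catches.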
                

        

[jira] [Assigned] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Daniel Spicar (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Spicar reassigned CLEREZZA-670:
--------------------------------------

    Assignee: Daniel Spicar
    

        

[jira] [Commented] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Rupert Westenthaler (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191257#comment-13191257 ] 

Rupert Westenthaler commented on CLEREZZA-670:
----------------------------------------------

Hi Daniel

You are right, it was indeed an oversight on my part. I can confirm that I also grant the second patch under the Apache License, Version 2.0 [1].

best
Rupert Westenthaler


[1] http://www.apache.org/licenses/LICENSE-2.0.html 
                

        

[jira] [Updated] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Rupert Westenthaler (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler updated CLEREZZA-670:
-----------------------------------------

    Attachment: CLEREZZA-670_rdf.rdfjson.int_value_overflow_and_wrong_datatypes_in_parser

After I added the equals assertion in the 

    RdfJsonSerializerProviderTest#testBigGraph()

I noticed that the test still fails.

However, after some debugging I found the error to be within the parser.

When parsing typed literals such as

"http://www.w3.org/2006/vcard/ns#longitude": [
        {
          "value": "12.8407428",
          "type": "literal",
          "datatype": "http://www.w3.org/2001/XMLSchema#float"
        }
      ]

the parser used the LiteralFactory to create a TypedLiteral based on the value.
As a result, all typed literals had the type xsd:string, because the
"value" was always returned as a String.

The implementation included in this patch directly constructs a TypedLiteralImpl from the value and the datatype.
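
The difference between the two approaches can be sketched as follows. The types and methods here are hypothetical minimal stand-ins, not the Clerezza LiteralFactory or TypedLiteralImpl API, and the declared datatype is assumed to be xsd:float.

```java
// Hypothetical minimal types for illustration only; not the Clerezza API.
class TypedLiteralDemo {

    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    // A typed literal reduced to its lexical form and datatype IRI.
    static final class Literal {
        final String lexicalForm;
        final String datatype;

        Literal(String lexicalForm, String datatype) {
            this.lexicalForm = lexicalForm;
            this.datatype = datatype;
        }
    }

    // Derives the datatype from the Java type of the value, as a
    // value-based factory would. A JSON "value" arrives as a String,
    // so this always ends up with xsd:string for parsed input.
    static Literal fromJavaValue(Object value) {
        if (value instanceof Float) {
            return new Literal(value.toString(), XSD + "float");
        }
        if (value instanceof Integer) {
            return new Literal(value.toString(), XSD + "int");
        }
        return new Literal(String.valueOf(value), XSD + "string");
    }

    // Keeps the datatype that was declared next to the value.
    static Literal fromDeclaredType(String lexicalForm, String datatype) {
        return new Literal(lexicalForm, datatype);
    }

    public static void main(String[] args) {
        String value = "12.8407428";
        String declared = XSD + "float"; // assumed declared datatype

        // Prints http://www.w3.org/2001/XMLSchema#string
        System.out.println(fromJavaValue(value).datatype);
        // Prints http://www.w3.org/2001/XMLSchema#float
        System.out.println(fromDeclaredType(value, declared).datatype);
    }
}
```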

In addition, this new patch updates the unit test mentioned above to perform an
equals test between the serialized and the parsed graph. To keep the
computation time of the equals check within a few seconds, I had to reduce the size of the graph to 5k triples.

This patch replaces the first one.

best
Rupert

                

        

[jira] [Resolved] (CLEREZZA-670) Integer MAX/MIN value overflow in Comparator of the RDF JSON serializer

Posted by "Daniel Spicar (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CLEREZZA-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Spicar resolved CLEREZZA-670.
------------------------------------

    Resolution: Fixed

Thanks Rupert
                