Posted to notifications@rya.apache.org by "David W. Lotts (JIRA)" <ji...@apache.org> on 2017/02/06 17:20:41 UTC

[jira] [Commented] (RYA-43) NumberFormatException for large integers

    [ https://issues.apache.org/jira/browse/RYA-43?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854404#comment-15854404 ] 

David W. Lotts commented on RYA-43:
-----------------------------------

Here is a reasonable (great?) SOLUTION!
Backward compatibility is the reason this has not been fixed yet.  This solution achieves it by combining the encoding from LexiTypeEncoders.bigIntegerEncoder() with the existing LexiTypeEncoders.integerEncoder().

The idea, which should not break existing Rya repositories:
Encode Java-sized integers as-is.  For anything out of range, encode Integer.MAX_VALUE or Integer.MIN_VALUE and append the new big integer encoding.

Pros: Regular integers are unencumbered.
Cons: Every large integer literal stored will carry an extra 8 bytes.  That is the only disadvantage I see.

Here is the current encoding, which returns a String, in class org.apache.rya.api.resolver.impl.IntegerRyaTypeResolver:
            return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));
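
To make the failure concrete, here is a minimal, self-contained sketch (the class and method names are made up for illustration and are not part of Rya).  It shows that Integer.parseInt() rejects the value from the issue report, while java.math.BigInteger parses it without complaint:

```java
import java.math.BigInteger;

public class ParseIntOverflow {
    // Hypothetical helper: true only if Integer.parseInt accepts the string.
    static boolean fitsInInt(String data) {
        try {
            Integer.parseInt(data);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String data = "9223372036854775807"; // literal from the issue report
        System.out.println(fitsInInt(data));       // parseInt throws, so this is false
        System.out.println(new BigInteger(data));  // BigInteger has no fixed range
    }
}
```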

Here is my replacement:

if (value >= Integer.MAX_VALUE) { // value is a String here; fix this with parseInt() and a catch, or similar
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MAX_VALUE) + LexiTypeEncoders.bigIntegerEncoder().encode(new BigInteger(data));
} else if (value <= Integer.MIN_VALUE) { // fix this as above
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MIN_VALUE) + LexiTypeEncoders.bigIntegerEncoder().encode(new BigInteger(data));
} else {
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));
}

That's it!
What remains is finding a good way to do the range comparison while the value is still a String; catching the NumberFormatException from Integer.parseInt() probably makes sense.  Deserialization also needs to be coded as the reverse of this.
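
One possible way to do the range check and the reverse decoding is sketched below.  This is an illustration under assumptions, not the Rya implementation: the class name and the encodeInt/encodeBig helpers are hypothetical stand-ins for INTEGER_STRING_TYPE_ENCODER and LexiTypeEncoders.bigIntegerEncoder(), and the big-integer stand-in is a plain decimal string that does not sort lexicographically.  The range check uses BigInteger.compareTo() instead of catching an exception, and decoding uses the fixed width of the int prefix to detect the extended form.

```java
import java.math.BigInteger;

public class HybridIntegerEncoding {
    private static final BigInteger INT_MAX = BigInteger.valueOf(Integer.MAX_VALUE);
    private static final BigInteger INT_MIN = BigInteger.valueOf(Integer.MIN_VALUE);
    // Width of the stand-in int encoding: 8 hex characters.
    private static final int INT_WIDTH = 8;

    // Stand-in int encoder (assumption): flip the sign bit and print 8 hex
    // digits so that the strings sort in numeric order.
    static String encodeInt(int v) {
        return String.format("%08x", v ^ Integer.MIN_VALUE);
    }

    static int decodeInt(String s) {
        return Integer.parseUnsignedInt(s, 16) ^ Integer.MIN_VALUE;
    }

    // Stand-in big-integer encoder (assumption): plain decimal text.
    static String encodeBig(BigInteger v) {
        return v.toString();
    }

    static BigInteger decodeBig(String s) {
        return new BigInteger(s);
    }

    // Values at or beyond the int bounds get the MAX/MIN prefix plus the
    // big-integer encoding; the >= and <= tests follow the snippet above.
    static String serialize(String data) {
        BigInteger value = new BigInteger(data);
        if (value.compareTo(INT_MAX) >= 0) {
            return encodeInt(Integer.MAX_VALUE) + encodeBig(value);
        } else if (value.compareTo(INT_MIN) <= 0) {
            return encodeInt(Integer.MIN_VALUE) + encodeBig(value);
        }
        return encodeInt(value.intValue());
    }

    // Reverse: anything longer than the int prefix carries a big-integer suffix.
    static String deserialize(String encoded) {
        if (encoded.length() > INT_WIDTH) {
            return decodeBig(encoded.substring(INT_WIDTH)).toString();
        }
        return Integer.toString(decodeInt(encoded));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "-5", "9223372036854775807"}) {
            System.out.println(s + " -> " + serialize(s) + " -> " + deserialize(serialize(s)));
        }
    }
}
```

In this sketch, values strictly inside the int range produce exactly the same string as the int encoder alone, which is the property the backward-compatibility argument relies on.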

david.

> NumberFormatException for large integers
> ----------------------------------------
>
>                 Key: RYA-43
>                 URL: https://issues.apache.org/jira/browse/RYA-43
>             Project: Rya
>          Issue Type: Bug
>    Affects Versions: 3.2.10
>            Reporter: Jesse Hatfield
>            Assignee: David W. Lotts
>         Attachments: integer
>
>
> Attempting to insert a value with datatype {{xsd:integer}} and value outside the range of a Java int will fail with an exception.
> It looks like Rya resolves any {{xsd:integer}} as an int, whereas the [XMLSchema specification|https://www.w3.org/TR/xmlschema11-2/#integer] defines {{xsd:integer}} as the infinite set of all integers (with subsets {{xsd:long}} and {{xsd:int}} having bounded range). Therefore we fail to parse what should be a valid triple.
> Example input:
> {code}<http://dbpedia.org/resource/Pseudohypoaldosteronism> <http://dbpedia.org/ontology/omim> "9223372036854775807"^^<http://www.w3.org/2001/XMLSchema#integer> .{code}
> Result:
> {code}
> $ hadoop jar accumulo.rya-3.2.10-SNAPSHOT-shaded.jar mvm.rya.accumulo.mr.fileinput.RdfFileInputTool -conf conf.xml -Drdf.tablePrefix=int_bug_ -Drdf.format=N-Triples /input/integer.nt
> [...]
> Error: java.io.IOException: mvm.rya.api.resolver.triple.TripleRowResolverException: mvm.rya.api.resolver.RyaTypeResolverException: Exception occurred serializing data[9223372036854775807]
> 	at mvm.rya.accumulo.RyaTableMutationsFactory.serialize(RyaTableMutationsFactory.java:75)
> 	at mvm.rya.accumulo.mr.fileinput.RdfFileInputTool$StatementToMutationMapper.map(RdfFileInputTool.java:157)
> 	at mvm.rya.accumulo.mr.fileinput.RdfFileInputTool$StatementToMutationMapper.map(RdfFileInputTool.java:124)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: mvm.rya.api.resolver.triple.TripleRowResolverException: mvm.rya.api.resolver.RyaTypeResolverException: Exception occurred serializing data[9223372036854775807]
> 	at mvm.rya.api.resolver.triple.impl.WholeRowTripleResolver.serialize(WholeRowTripleResolver.java:82)
> 	at mvm.rya.api.resolver.RyaTripleContext.serializeTriple(RyaTripleContext.java:85)
> 	at mvm.rya.accumulo.RyaTableMutationsFactory.serialize(RyaTableMutationsFactory.java:67)
> 	... 10 more
> Caused by: mvm.rya.api.resolver.RyaTypeResolverException: Exception occurred serializing data[9223372036854775807]
> 	at mvm.rya.api.resolver.impl.IntegerRyaTypeResolver.serializeData(IntegerRyaTypeResolver.java:50)
> 	at mvm.rya.api.resolver.impl.RyaTypeResolverImpl.serializeType(RyaTypeResolverImpl.java:82)
> 	at mvm.rya.api.resolver.RyaContext.serializeType(RyaContext.java:121)
> 	at mvm.rya.api.resolver.triple.impl.WholeRowTripleResolver.serialize(WholeRowTripleResolver.java:64)
> 	... 12 more
> Caused by: java.lang.NumberFormatException: For input string: "9223372036854775807"
> 	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> 	at java.lang.Integer.parseInt(Integer.java:583)
> 	at java.lang.Integer.parseInt(Integer.java:615)
> 	at mvm.rya.api.resolver.impl.IntegerRyaTypeResolver.serializeData(IntegerRyaTypeResolver.java:48)
> 	... 15 more
> {code}


