Posted to dev@rya.apache.org by "Ly, Kiet" <Ki...@finra.org> on 2017/01/10 18:47:33 UTC

rya parser failed on parsing large numeric type

The RDF parser treats a large numeric literal as an integer and fails to parse it. Is there any workaround for this?
This is a recent build from the master branch, 3.2.10 I think.

Caused by: java.lang.NumberFormatException: For input string: "-6703205597155942197"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:583)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.rya.api.resolver.impl.IntegerRyaTypeResolver.serializeData(IntegerRyaTypeResolver.java:48)
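
The trace is consistent with the literal simply being too large for a 32-bit int: Integer.parseInt only accepts values from -2147483648 to 2147483647. A minimal sketch that reproduces the failure with plain JDK calls, outside Rya (the class and method names are illustrative and not part of Rya):

```java
import java.math.BigInteger;

// Illustrative only: reproduces the parse failure without any Rya code.
final class ParseOverflowDemo {

    /** Returns true when the decimal string parses as a 32-bit Java int. */
    static boolean fitsInInt(String lexical) {
        try {
            Integer.parseInt(lexical);
            return true;
        } catch (NumberFormatException e) {
            // Thrown for out-of-range values as well as malformed input.
            return false;
        }
    }

    public static void main(String[] args) {
        String literal = "-6703205597155942197"; // value from the stack trace
        System.out.println("fits in int: " + fitsInInt(literal));
        // The same string is within the 64-bit range, so Long.parseLong accepts it.
        System.out.println("as long: " + Long.parseLong(literal));
        // BigInteger has no fixed range at all.
        System.out.println("as BigInteger: " + new BigInteger(literal));
    }
}
```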

Confidentiality Notice::  This email, including attachments, may include non-public, proprietary, confidential or legally privileged information.  If you are not an intended recipient or an authorized agent of an intended recipient, you are hereby notified that any dissemination, distribution or copying of the information contained in or transmitted with this e-mail is unauthorized and strictly prohibited.  If you have received this email in error, please notify the sender by replying to this message and permanently delete this e-mail, its attachments, and any copies of it immediately.  You should not retain, copy or use this e-mail or any attachment for any purpose, nor disclose all or any part of the contents to any other person. Thank you.

Re: rya parser failed on parsing large numeric type

Posted by David Lotts <dl...@gmail.com>.
> Good idea, but why can’t you use LongEncoder which is also 8 bytes?
It's great that it works for you!  It would probably work for lots of cases
since a long is large.
I am seeing 16 bytes.  Maybe we are looking at something different:
org.calrissian.mango.types.encoders.lexi.LongEncoder.class
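
One possible explanation for the 8 versus 16 difference, offered as an assumption rather than a reading of mango's source: a long occupies 8 bytes in memory, but an encoder that writes it as a zero-padded hex string produces 16 characters. The sketch below shows one such sortable encoding. The class and method names are hypothetical, and this is not mango's LongEncoder:

```java
// Hypothetical sketch of a lexicographically sortable long encoding.
final class SortableLongSketch {

    /** Flips the sign bit so that unsigned hex order matches signed numeric order. */
    static String encode(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    /** Reverses encode(): parses the hex digits and flips the sign bit back. */
    static long decode(String hex) {
        return Long.parseUnsignedLong(hex, 16) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long value = -6703205597155942197L; // value from the original report
        String encoded = encode(value);
        // 8 bytes of data become 16 hex characters.
        System.out.println(encoded + " (" + encoded.length() + " characters)");
        System.out.println("round trip ok: " + (decode(encoded) == value));
    }
}
```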

However, it does not meet the RDF specifications, which I believe allow
integers of unlimited length.
I'm going to paste my solution from my earlier message into RYA-43 in case
someone wants to implement it.
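
To illustrate the range point, assuming the literals in question are typed xsd:integer (whose value space is unbounded): a long covers the value from the original report, but a literal just past 2^63 - 1 already needs BigInteger. A small sketch, with a hypothetical helper name:

```java
import java.math.BigInteger;

// Illustrative helper: reports the narrowest Java type that can hold a literal.
final class IntegerWidthSketch {

    static String narrowestJavaType(String lexical) {
        BigInteger value = new BigInteger(lexical);
        // bitLength() excludes the sign bit, so 31 bits means "fits in int"
        // and 63 bits means "fits in long".
        if (value.bitLength() <= 31) {
            return "int";
        }
        if (value.bitLength() <= 63) {
            return "long";
        }
        return "BigInteger";
    }

    public static void main(String[] args) {
        String[] samples = {"42", "-6703205597155942197", "9223372036854775808"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + narrowestJavaType(sample));
        }
    }
}
```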

david.

On Wed, Jan 11, 2017 at 1:20 PM, Ly, Kiet <Ki...@finra.org> wrote:

> Good idea, but why can’t you use LongEncoder which is also 8 bytes?
>
> BigInteger.parseInt(array, start, end) requires an array conversion and
> extra start and end parameters.
>
> I replaced Integer with Long and it worked for me.
>
> public static final TypeEncoder<Long, String> INTEGER_STRING_TYPE_ENCODER = LexiTypeEncoders.longEncoder();

Re: rya parser failed on parsing large numeric type

Posted by "Ly, Kiet" <Ki...@finra.org>.
Good idea, but why can’t you use LongEncoder which is also 8 bytes?

BigInteger.parseInt(array, start, end) requires an array conversion and extra start and end parameters.

I replaced Integer with Long and it worked for me.

public static final TypeEncoder<Long, String> INTEGER_STRING_TYPE_ENCODER = LexiTypeEncoders.longEncoder();



On 1/11/17, 12:51 PM, "David Lotts" <dl...@gmail.com> wrote:

    I had an idea for making this backward compatible.  It should not break
    existing Rya repositories:
    Encode the Java-sized integers as-is, then for anything out of range, use
    MAX/MIN and concatenate the new big integer encoding.

    Here is the current way of encoding, which returns a string:
                return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));

    Here is my replacement:

    // data is a string, so parse it as a BigInteger before comparing
    BigInteger value = new BigInteger(data);
    if (value.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) >= 0) {
        return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MAX_VALUE) + bigIntegerEncode(value);
    } else if (value.compareTo(BigInteger.valueOf(Integer.MIN_VALUE)) <= 0) {
        return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MIN_VALUE) + bigIntegerEncode(value);
    } else {
        return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));
    }

    The only disadvantage I see is that every big integer literal that you
    store will have an extra 8 bytes.  Regular integers are unencumbered.

    david.




Re: rya parser failed on parsing large numeric type

Posted by David Lotts <dl...@gmail.com>.
I had an idea for making this backward compatible.  It should not break
existing Rya repositories:
Encode the Java-sized integers as-is, then for anything out of range, use
MAX/MIN and concatenate the new big integer encoding.

Here is the current way of encoding, which returns a string:
            return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));

Here is my replacement:

// data is a string, so parse it as a BigInteger before comparing
BigInteger value = new BigInteger(data);
if (value.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) >= 0) {
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MAX_VALUE) + bigIntegerEncode(value);
} else if (value.compareTo(BigInteger.valueOf(Integer.MIN_VALUE)) <= 0) {
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.MIN_VALUE) + bigIntegerEncode(value);
} else {
    return INTEGER_STRING_TYPE_ENCODER.encode(Integer.parseInt(data));
}

The only disadvantage I see is that every big integer literal that you
store will have an extra 8 bytes.  Regular integers are unencumbered.
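
A self-contained sketch of the scheme above, to check that string order still follows numeric order. Both encoders here are stand-ins written only for this example (8 hex characters for the int part, length-prefixed decimal digits for the big integer part); they are assumptions and not the mango LexiTypeEncoders, and the 4-digit length prefix limits this toy version to 9999 digits:

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "MAX/MIN prefix plus big integer suffix" idea.
final class HybridIntegerEncodingSketch {
    private static final BigInteger INT_MAX = BigInteger.valueOf(Integer.MAX_VALUE);
    private static final BigInteger INT_MIN = BigInteger.valueOf(Integer.MIN_VALUE);

    /** Stand-in for the existing int encoder: 8 hex chars, sign bit flipped. */
    static String encodeInt(int value) {
        return String.format("%08x", value ^ Integer.MIN_VALUE);
    }

    /** Stand-in for bigIntegerEncode: length-prefixed digits, complemented when negative. */
    static String encodeBig(BigInteger value) {
        String digits = value.abs().toString();
        if (value.signum() >= 0) {
            return String.format("%04d", digits.length()) + digits;
        }
        // For negatives, complement length and digits so larger magnitudes sort first.
        StringBuilder sb = new StringBuilder(String.format("%04d", 9999 - digits.length()));
        for (char c : digits.toCharArray()) {
            sb.append((char) ('0' + ('9' - c)));
        }
        return sb.toString();
    }

    /** In-range values keep the plain int encoding; boundary and out-of-range values get a suffix. */
    static String encode(String data) {
        BigInteger value = new BigInteger(data);
        if (value.compareTo(INT_MAX) >= 0) {
            return encodeInt(Integer.MAX_VALUE) + encodeBig(value);
        } else if (value.compareTo(INT_MIN) <= 0) {
            return encodeInt(Integer.MIN_VALUE) + encodeBig(value);
        }
        return encodeInt(value.intValue());
    }

    public static void main(String[] args) {
        // Listed in ascending numeric order; the encoded strings should ascend too.
        List<String> ascending = Arrays.asList(
                "-6703205597155942197", "-2147483649", "-1", "0", "42",
                "2147483647", "2147483648", "6703205597155942197");
        for (String literal : ascending) {
            System.out.println(encode(literal) + "  <-  " + literal);
        }
    }
}
```

Note that because the comparisons use >= and <=, the boundary values Integer.MAX_VALUE and Integer.MIN_VALUE themselves take the suffixed form in this sketch, which is what keeps values below the minimum sorting before it.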

david.

On Tue, Jan 10, 2017 at 3:32 PM, David Lotts <dl...@gmail.com> wrote:

> > RDF parser confused between large numeric data type with integer. Any
> work around for this?  This is a recent build from master branch 3.2.10 I
> think.
>
> This issue is reported here: https://issues.apache.org/jira/browse/RYA-43
>
> This has a mechanically simple fix, but it breaks existing
> implementations.  So making it backward compatible is probably why it has
> not been done yet.  Backward compatible can be done by designing a way to
> mix the encoding from LexiTypeEncoders.bigIntegerEncoder() with the
> existing LexiTypeEncoders.integerEncoder().  Or create a new Lexical
> encoder, that handles both, or maybe upgrade utility to modify the data.
>
> So you could do this as a work around in your own code if you have luxury
> of starting from an empty Rya repo:
>
> Copy /rya.api/src/main/java/org/apache/rya/api/resolver/impl/IntegerRyaTypeResolver.java
> to LittleIntegerRyaTypeResolver.java
>
> Then, in IntegerRyaTypeResolver.java, wherever it uses an Integer,
> replace it with a BigInteger.
> Replace LexiTypeEncoders.integerEncoder() with
> LexiTypeEncoders.bigIntegerEncoder()
>     https://github.com/calrissian/mango/blob/master/mango-core/src/main/java/org/calrissian/mango/types/encoders/lexi/BigIntegerEncoder.java
>
> Test and done!
>
> david.
>

Re: rya parser failed on parsing large numeric type

Posted by David Lotts <dl...@gmail.com>.
> RDF parser confused between large numeric data type with integer. Any
> work around for this?  This is a recent build from master branch 3.2.10 I
> think.

This issue is reported here: https://issues.apache.org/jira/browse/RYA-43

This has a mechanically simple fix, but it breaks existing
implementations.  The need to keep it backward compatible is probably why
it has not been done yet.  Backward compatibility could be achieved by
designing a way to mix the encoding from
LexiTypeEncoders.bigIntegerEncoder() with the existing
LexiTypeEncoders.integerEncoder(), by creating a new lexical encoder that
handles both, or maybe by writing an upgrade utility to modify the data.
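
To make the compatibility problem concrete: if the stored keys were written with one encoding and lookups use another, the same number no longer produces the same string. A minimal sketch with two stand-in encoders (both are hypothetical and written for this example only, not the mango implementations):

```java
import java.math.BigInteger;

// Hypothetical sketch: the same value encoded two ways yields different keys.
final class EncoderSwapSketch {

    /** Stand-in for the current 32-bit encoding: 8 hex chars, sign bit flipped. */
    static String oldEncode(int value) {
        return String.format("%08x", value ^ Integer.MIN_VALUE);
    }

    /** Stand-in for a replacement encoding: 4-digit length plus decimal digits. */
    static String newEncode(BigInteger value) {
        if (value.signum() < 0) {
            throw new IllegalArgumentException("negative values are omitted from this sketch");
        }
        String digits = value.toString();
        return String.format("%04d", digits.length()) + digits;
    }

    public static void main(String[] args) {
        String storedKey = oldEncode(42);                      // written by the old code
        String lookupKey = newEncode(BigInteger.valueOf(42));  // produced by the new code
        System.out.println("stored: " + storedKey + ", lookup: " + lookupKey);
        // A lookup by the new key would miss rows stored under the old key.
        System.out.println("keys match: " + storedKey.equals(lookupKey));
    }
}
```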

So you could do this as a workaround in your own code if you have the luxury
of starting from an empty Rya repo:

Copy
/rya.api/src/main/java/org/apache/rya/api/resolver/impl/IntegerRyaTypeResolver.java
to LittleIntegerRyaTypeResolver.java

Then, in IntegerRyaTypeResolver.java, wherever it uses an Integer, replace
it with a BigInteger.
Replace LexiTypeEncoders.integerEncoder() with
LexiTypeEncoders.bigIntegerEncoder()

https://github.com/calrissian/mango/blob/master/mango-core/src/main/java/org/calrissian/mango/types/encoders/lexi/BigIntegerEncoder.java

Test and done!

david.