You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "clesaec (via GitHub)" <gi...@apache.org> on 2023/06/21 10:01:32 UTC

[GitHub] [avro] clesaec opened a new pull request, #2302: AVRO-3779: using rust bigdecimal

clesaec opened a new pull request, #2302:
URL: https://github.com/apache/avro/pull/2302

   <!--
   
   *Thank you very much for contributing to Apache Avro - we are happy that you want to help us improve Avro. To help the community review your contribution in the best possible way, please go through the checklist below, which will get the contribution into a shape in which it can be best reviewed.*
   
   *Please understand that we do not do this to make contributions to Avro a hassle. In order to uphold a high standard of quality for code contributions, while at the same time managing a large number of contributions, we need contributors to prepare the contributions well, and give reviewers enough contextual information for the review. Please also understand that contributions that do not follow this guide will take longer to review and thus typically be picked up with lower priority by the community.*
   
   ## Contribution Checklist
   
     - Make sure that the pull request corresponds to a [JIRA issue](https://issues.apache.org/jira/projects/AVRO/issues). Exceptions are made for typos in JavaDoc or documentation files, which need no JIRA issue.
     
     - Name the pull request in the form "AVRO-XXXX: [component] Title of the pull request", where *AVRO-XXXX* should be replaced by the actual issue number. 
       The *component* is optional, but can help identify the correct reviewers faster: either the language ("java", "python") or subsystem such as "build" or "doc" are good candidates.  
   
     - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
     
     - Make sure that the change passes the automated tests. You can [build the entire project](https://github.com/apache/avro/blob/master/BUILD.md) or just the [language-specific SDK](https://avro.apache.org/project/how-to-contribute/#unit-tests).
   
     - Each pull request should address only one issue, not mix up code from multiple issues.
     
     - Each commit in the pull request has a meaningful commit message (including the JIRA id)
   
     - Every commit message references Jira issues in their subject lines. In addition, commits follow the guidelines from [How to write a good git commit message](https://chris.beams.io/posts/git-commit/)
       1. Subject is separated from body by a blank line
       1. Subject is limited to 50 characters (not including Jira issue reference)
       1. Subject does not end with a period
       1. Subject uses the imperative mood ("add", not "adding")
       1. Body wraps at 72 characters
       1. Body explains "what" and "why", not "how"
   
   -->
   
   ## What is the purpose of the change
   
   [AVRO-3779](https://issues.apache.org/jira/browse/AVRO-3779) : the idea is to propose a big-decimal type where scale is stored inside value, allowing flexible big decimal field in schema (as there is no need to anticipate scale, nor precision).
   
   This draft PR is related to equivalent [for Java](https://github.com/apache/avro/pull/2282).
   
   ## Verifying this change
   
   This change added unit tests and can be verified by running it
   Still need to check if serialization is same as Java (or change);
   
   ## Documentation
   
   - Does this pull request introduce a new feature? (yes)
   - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] clesaec commented on pull request #2302: AVRO-3779: using rust bigdecimal

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1604043377

   @martin-g : thanks for the review, fixes and relevant comments. I indeed change code (serialize_big_decimal especially :smile:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec merged PR #2302:
URL: https://github.com/apache/avro/pull/2302


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1819105724

   Could you be more precise @Ten0 , because this code is already on main branch for next 1.12 version. So, if i can add some tests and/or some fixes, the sooner the better !!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "ftelnov (via GitHub)" <gi...@apache.org>.
ftelnov commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1752905271

   You have a one approval from @opwvhk on Java PR. I don't know whether he is a maintainer. Even then I would still wait for another approval from @martin-g, as change is significant in my opinion - you've introduced new type, which is only presented in small range of languages.
   
   Moreover, Rust PR is not finished - you've pointed out that bigdecimal is only available on Java, which is not the case after this PR is merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1752884353

   Discussions were done arround [Java equivalent PR](https://github.com/apache/avro/pull/2282) and this [discussion thread](https://lists.apache.org/thread/068z12388smjgpkbxz4yy2772pd2dgvf) on avro mail.
   So, i thought it was ok. I can revert this and Java one if not ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1756145091

   > So, i consider it's ok, @martin-g ?
   
   All is good!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "Ten0 (via GitHub)" <gi...@apache.org>.
Ten0 commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1817587844

   Looks like it's missing a round trip test. Looking at the code it shouldn't pass because serialize and deserialize implementations seem incoherent.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #2302: AVRO-3779: using rust bigdecimal

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on code in PR #2302:
URL: https://github.com/apache/avro/pull/2302#discussion_r1239531016


##########
lang/rust/avro/src/decimal.rs:
##########
@@ -99,12 +104,50 @@ impl<T: AsRef<[u8]>> From<T> for Decimal {
     }
 }
 
+pub(crate) fn serialize_big_decimal(bg: &BigDecimal) -> Result<Vec<u8>, Error> {

Review Comment:
   This method never fails. Do we need the `Result` ?



##########
lang/rust/avro/src/decimal.rs:
##########
@@ -99,12 +104,50 @@ impl<T: AsRef<[u8]>> From<T> for Decimal {
     }
 }
 
+pub(crate) fn serialize_big_decimal(bg: &BigDecimal) -> Result<Vec<u8>, Error> {
+    let mut buffer: Vec<u8> = Vec::new();
+    let big_dec_inner: (BigInt, i64) = bg.as_bigint_and_exponent();

Review Comment:
   `let (big_int, exponent): (BigInt, i64) = decimal.as_bigint_and_exponent();`



##########
lang/rust/avro/src/encode.rs:
##########
@@ -114,6 +115,12 @@ pub(crate) fn encode_internal<S: Borrow<Schema>>(
             &uuid.to_string(),
             buffer,
         ),
+        Value::BigDecimal(bg) => {
+            let result: Result<Vec<u8>, Error> = serialize_big_decimal(bg);
+            if let Ok(mut buf) = result {
+                buffer.append(&mut buf);
+            }

Review Comment:
   I think we don't need the Result. Or we must not ignore the error here.



##########
lang/rust/avro/src/decimal.rs:
##########
@@ -99,12 +104,50 @@ impl<T: AsRef<[u8]>> From<T> for Decimal {
     }
 }
 
+pub(crate) fn serialize_big_decimal(bg: &BigDecimal) -> Result<Vec<u8>, Error> {
+    let mut buffer: Vec<u8> = Vec::new();
+    let big_dec_inner: (BigInt, i64) = bg.as_bigint_and_exponent();
+
+    let big_endian_value: Vec<u8> = big_dec_inner.0.to_signed_bytes_be();
+    encode_bytes(&big_endian_value, &mut buffer);
+    encode_long(big_dec_inner.1, &mut buffer);
+
+    Ok(buffer)
+}
+
+pub(crate) fn deserialize_big_decimal(stream: &Vec<u8>) -> Result<BigDecimal, Error> {
+    let mut x1: &[u8] = stream.as_slice();
+    let result: AvroResult<usize> = decode_len(&mut x1);
+
+    if result.is_err() {
+        return Err(Error::BigDecimalSign);

Review Comment:
   Is this the correct error ?
   Asking because it is reused few more times below



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1820569743

   Round trip test is already done in "decimal.rs" source, line 180;
   by calling `serialize_big_decimal`, then `deserialize_big_decimal` on the Result; and check final result is equals; all this in a loop of 180 different values. 
   Did i miss something in the test ?
   
   ```    #[test]
       fn test_avro_3779_bigdecimal_serial() -> TestResult {
           let value: bigdecimal::BigDecimal =
               bigdecimal::BigDecimal::from(-1421).div(bigdecimal::BigDecimal::from(2));
           let mut current: bigdecimal::BigDecimal = bigdecimal::BigDecimal::one();
   
           for iter in 1..180 {
               let result: Vec<u8> = serialize_big_decimal(&current);
   
               let deserialize_big_decimal: Result<bigdecimal::BigDecimal, Error> =
                   deserialize_big_decimal(&result);
               assert!(
                   deserialize_big_decimal.is_ok(),
                   "can't deserialize for iter {iter}"
               );
               assert_eq!(
                   current,
                   deserialize_big_decimal.unwrap(),
                   "not equals for ${iter}"
               );
               current = current.mul(&value);
           }```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "Ten0 (via GitHub)" <gi...@apache.org>.
Ten0 commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1820649638

   Ah I missed that one, I was looking for another kind of test in this PR: this one tests the two functions but doesn't test what is actually serialized and deserialized using a Schema and Value, which leaves some room for inconsistency: outside of these function calls. (so indeed it seems not any round trip test does show it) 
   https://github.com/apache/avro/blob/main/lang/rust/avro/tests/io.rs
   Unless I'm mistaken the Bytes len itself is never serialized but is deserialized. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] clesaec commented on pull request #2302: AVRO-3779: using rust bigdecimal

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1602841760

   As i build a PR in Java for this new logical type (big-decimal without predefining scale nor precision), for other language, it's necessary that an avro file build with Java avro can be read with Rust avro, and vice-versa.
   (And as this is a new logical type, there is no "AVRO specs" about it yet)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] clesaec commented on pull request #2302: AVRO-3779: using rust bigdecimal

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1604131786

   :+1: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "Ten0 (via GitHub)" <gi...@apache.org>.
Ten0 commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1819107388

   Any round trip test should do 😉


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] ftelnov commented on pull request #2302: AVRO-3779: using rust bigdecimal

Posted by "ftelnov (via GitHub)" <gi...@apache.org>.
ftelnov commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1602587625

   Why do you think that the serialization must be the same java use? It must satisfy the AVRO specs, which is signed big-endian format.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "ftelnov (via GitHub)" <gi...@apache.org>.
ftelnov commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1752852481

   Why did you merge this? Did you discuss adding such new functionality into the avro specs? I'm asking cuz approval is not placed here, nor topic is discussed in the issue. To me it seems like out of spec thing, which is not well-documented anywhere else


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] AVRO-3779: using rust bigdecimal [avro]

Posted by "clesaec (via GitHub)" <gi...@apache.org>.
clesaec commented on PR #2302:
URL: https://github.com/apache/avro/pull/2302#issuecomment-1752931288

   Martin G only wrote 
   ```
   I am OK with this approach !
   Do you think it would be a good idea to mark this new logical type as
   experimental in the specification ?
   ```
   So, i consider it's ok, @martin-g ?
   @opwvhk is an Avro commiter.
   
   _Only presented in small range of languages_ Yes, that's often the case for new features (we can't add it for all languages in one shot). I just check it's possible in C#, and, normally in every languages where Decimal logical type is present.
   
   _only available on Java_ Indeed, may be i should build another PR to fix this documentation issue ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org