Posted to dev@avro.apache.org by GitBox <gi...@apache.org> on 2022/12/15 14:03:39 UTC

[GitHub] [avro] martin-g opened a new pull request, #2014: Avro 3683 multiple schemas

martin-g opened a new pull request, #2014:
URL: https://github.com/apache/avro/pull/2014

   **WIP**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1354586186

   @markfarnan I've re-worked it. Now the new roundtrip test passes.
   Please give it a try!
   I need to update the Reader/Writer APIs to support this too.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1463430656

   There is a discussion about releasing Avro 1.11.2 on the dev@ mailing list.
   I still have no confirmation that the proposed changes in this PR work for the requesters ...
   
   Also CC @woile @WaterKnight1998




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1356859788

   > If I understand it correctly, this implementation means that for a complex message (i.e. with deep schema recursion) the user of the API has to build a vector with all the schemas and track the root one. Right?
   
   This PR is about the use case where there is more than one schema and the schemata refer to each other.
   When reading/writing, the user has to provide the root/main schema as the first parameter and all schemata as the second parameter (so that any references in the root/main schema and in all the others can be resolved).




[GitHub] [avro] untereiner commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
untereiner commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1356808182

   Hi! I'm joining the conversation midway and I am wondering about a few things. If I understand it correctly, this implementation means that for a complex message (i.e. with deep schema recursion) the user of the API has to build a vector with all the schemas and track the root one. Right?
   What if the user gives a canonical schema? I mean one where all external records that are part of a root schema are replaced by their schema definitions?




[GitHub] [avro] woile commented on a diff in pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "woile (via GitHub)" <gi...@apache.org>.
woile commented on code in PR #2014:
URL: https://github.com/apache/avro/pull/2014#discussion_r1132135124


##########
lang/rust/avro/src/reader.rs:
##########
@@ -178,7 +180,13 @@ impl<R: Read> Block<R> {
 
         let mut block_bytes = &self.buf[self.buf_idx..];
         let b_original = block_bytes.len();
-        let item = from_avro_datum(&self.writer_schema, &mut block_bytes, read_schema)?;
+        let schemata = if self.schemata.is_empty() {

Review Comment:
   Oh I see! I think if it breaks I'll update my code. I've just started reading this PR, I'll add a comment as soon as I'm done :+1:





[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "markfarnan (via GitHub)" <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1464904012

   @martin-g I confirm this works for me. Thanks!




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1354357859

   The only way I see to support this is to have a `main` schema, e.g.:
   ```rust
    let actual = to_avro_datum_schemata(&main_schema, &schemata.as_slice(), record_value).unwrap();
   ```
   
   where `main_schema` is the schema that should be used for read/write and is a member of `schemata`.
   I.e. `#parse_list()` returns a `Vec<Schema>`, the user picks the main one from them and passes **all** as a second argument for schema resolution.
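
The "main schema plus all schemata" idea can be sketched with a toy model. This is not the crate's code: the `Schema` enum, `check_refs` and `walk` below are hypothetical stand-ins, assumed only for illustration. The sketch shows why the full list is needed: references inside the main schema can only be followed if every named schema is available for lookup.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for an Avro schema.
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Long,
    Str,
    /// A named record with (field name, field schema) pairs.
    Record { name: String, fields: Vec<(String, Schema)> },
    /// A reference to a named schema defined elsewhere.
    Ref(String),
}

/// Checks that every `Ref` reachable from `main` resolves against the
/// named schemata in `schemata`. Returns the first unresolved name.
fn check_refs(main: &Schema, schemata: &[&Schema]) -> Result<(), String> {
    // Index all top-level records by name, independent of their order.
    let mut names: HashMap<&str, &Schema> = HashMap::new();
    for s in schemata.iter().copied() {
        if let Schema::Record { name, .. } = s {
            names.insert(name.as_str(), s);
        }
    }
    let mut visited = Vec::new();
    walk(main, &names, &mut visited)
}

fn walk<'a>(
    schema: &'a Schema,
    names: &HashMap<&'a str, &'a Schema>,
    visited: &mut Vec<&'a str>,
) -> Result<(), String> {
    match schema {
        Schema::Long | Schema::Str => Ok(()),
        Schema::Record { fields, .. } => {
            for (_, field_schema) in fields {
                walk(field_schema, names, visited)?;
            }
            Ok(())
        }
        Schema::Ref(name) => {
            // A name that was already followed is skipped, so recursive
            // schemas do not loop forever.
            if visited.contains(&name.as_str()) {
                return Ok(());
            }
            let target = *names
                .get(name.as_str())
                .ok_or_else(|| format!("unresolved reference: {name}"))?;
            visited.push(name.as_str());
            walk(target, names, visited)
        }
    }
}
```

In this model, passing only the main schema fails as soon as it refers to another record, while passing the whole list succeeds regardless of the order of its elements.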




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1398186238

   @markfarnan I see you gave a thumbs-up to my comment above. Could you please explicitly comment on whether you have tested the changes? Thanks!




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1430832636

   @chupaty Are you saying that you have reviewed and tested the PR with your application(s)?




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1356859936

   > What If the user gives an expanded schema ? I mean a schema where all named types are expanded to their real schema ?
   
   Then the user could just use the current APIs with just one argument with the root/main schema.




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1355002612

   Awesome, Will do !




[GitHub] [avro] woile commented on a diff in pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "woile (via GitHub)" <gi...@apache.org>.
woile commented on code in PR #2014:
URL: https://github.com/apache/avro/pull/2014#discussion_r1132119304


##########
lang/rust/avro/src/reader.rs:
##########
@@ -178,7 +180,13 @@ impl<R: Read> Block<R> {
 
         let mut block_bytes = &self.buf[self.buf_idx..];
         let b_original = block_bytes.len();
-        let item = from_avro_datum(&self.writer_schema, &mut block_bytes, read_schema)?;
+        let schemata = if self.schemata.is_empty() {

Review Comment:
   personal: I find a `match` more idiomatic for this:
   
   ```rs
   let schemata = match self.schemata {
       None => vec![&self.writer_schema],
       _ => self.schemata.clone(),
   };
   ```





[GitHub] [avro] markfarnan commented on pull request #2014: Avro 3683 multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1353810724

   Thanks, 
   
   It's also panicking with my test case, which uses slightly more complicated structs (the ones in the PR) and a struct rather than a manually constructed value.
   
   I'll see if I can work out why, or whether it's me; otherwise I'll update the PR in the morning with a test case that uses your new functions.
   
   Here is the snippet that panics inside "to_avro_datum_schemata", in case it's useful.
   
   ```rust
   let record = MultiSchemaTestTypeA {
       b: Some(MultiSchemaTestTypeB {
           d: String::from("tom"),
           e: 451,
       }),
       c: default_multischematesttypea_c(),
   };
   
   let schemata: Vec<Schema> = Schema::parse_list(&[schema_TypeA, schema_TypeB]).unwrap();
   let schemata: Vec<&Schema> = schemata.iter().collect();
   
   let record_value = to_value(&record).unwrap();
   let actual = to_avro_datum_schemata(&schemata.as_slice(), record_value).unwrap();
   ```




[GitHub] [avro] chupaty commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "chupaty (via GitHub)" <gi...@apache.org>.
chupaty commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1433938972

   I'd love to be able to commit more time to this.  But my very brief comments are:
   
   I initially tried running it against my dataset (~200 schemas), but ran into problems with ambiguous schema definitions (note that my previous workflow of using to_avro_datum(...) still works).
   
   I tried to reproduce the above with a minimal dataset (a hacked test_avro_3683_schemata_writer_reader), but ran into problems when I changed the order of the schemas loaded by Schema::parse_list, i.e. with schema 'a' and schema 'b' switched:
   
   let schemata: Vec<Schema> = Schema::parse_list(&[SCHEMA_B_STR, SCHEMA_A_STR]).unwrap();
   
   I do have concerns about the structure of the schemata in my use case (i.e. lots of schemas). N is fairly big for the O(N) search over the schemata, and potentially some big-ish Vecs are being passed around.
   
   I probably can't investigate much more in the short term, but will update when I can.




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1408703678

   @markfarnan Ping!




[GitHub] [avro] martin-g commented on a diff in pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on code in PR #2014:
URL: https://github.com/apache/avro/pull/2014#discussion_r1132130417


##########
lang/rust/avro/src/reader.rs:
##########
@@ -178,7 +180,13 @@ impl<R: Read> Block<R> {
 
         let mut block_bytes = &self.buf[self.buf_idx..];
         let b_original = block_bytes.len();
-        let item = from_avro_datum(&self.writer_schema, &mut block_bytes, read_schema)?;
+        let schemata = if self.schemata.is_empty() {

Review Comment:
   Thanks for the review, @woile ! I am more interested whether the API changes help or break somehow your work on https://github.com/woile/avdl-rs





[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1355734340

   On the flip side, I tested with one slightly more complex schema, passed in the right order, and it seems to round trip correctly !
   As that seems to work,  I'll do some more complex tests and let you know.  
   




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1358389385

   > performance difference between the two methods
   
   I am pretty sure the new (multiple schemata) methods will be slower than the old (single schema)!
   How much ? - Only benchmarks can tell you!




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "markfarnan (via GitHub)" <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1463738884

   Checking this over the weekend.   




[GitHub] [avro] martin-g commented on a diff in pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on code in PR #2014:
URL: https://github.com/apache/avro/pull/2014#discussion_r1132128895


##########
lang/rust/avro/src/reader.rs:
##########
@@ -178,7 +180,13 @@ impl<R: Read> Block<R> {
 
         let mut block_bytes = &self.buf[self.buf_idx..];
         let b_original = block_bytes.len();
-        let item = from_avro_datum(&self.writer_schema, &mut block_bytes, read_schema)?;
+        let schemata = if self.schemata.is_empty() {

Review Comment:
   `self.schemata` is a `Vec<&Schema>` - https://github.com/apache/avro/pull/2014/files#diff-0b204c2ac80059e0a721c5626b26ee3037f0ef202d61187f20c0ad714d7d6607R49
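
Since the field is a `Vec` rather than an `Option`, a `match` can still express the fallback by matching on the slice. The following is a minimal sketch under stated assumptions: `Schema`, `Block` and `effective_schemata` are hypothetical, reduced stand-ins and not the code from the PR.

```rust
/// Hypothetical stand-in for the crate's `Schema` type.
#[derive(Debug, PartialEq)]
struct Schema(&'static str);

/// Hypothetical, reduced version of the reader's block state.
struct Block<'a> {
    writer_schema: Schema,
    schemata: Vec<&'a Schema>,
}

impl<'a> Block<'a> {
    /// Returns the schemata to resolve against, falling back to the
    /// writer schema alone when no extra schemata were supplied.
    fn effective_schemata(&self) -> Vec<&Schema> {
        match self.schemata.as_slice() {
            // Empty slice pattern: nothing was supplied by the user.
            [] => vec![&self.writer_schema],
            // Otherwise copy the references (cheap: pointers only).
            all => all.to_vec(),
        }
    }
}
```

Whether this reads better than the `if self.schemata.is_empty()` form in the diff is a matter of taste; both do the same thing.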





[GitHub] [avro] woile commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "woile (via GitHub)" <gi...@apache.org>.
woile commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1463517816

   LGTM, this doesn't really affect me, as the [avdl parser](https://github.com/woile/avdl-rs) needs to solve all the references before even dealing with values. 




[GitHub] [avro] untereiner commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
untereiner commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1357888535

   > Then the user could just use the current APIs with just one argument with the root/main schema.
   
   ha ok! Do you think there is a performance difference between the two methods ? Considering the same schema, one time completely expanded and another time with this API




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1355729169

   Found a problem.
   
   The order in which the schemas are given to parse_list seems to matter. If they are passed in descending order of reference, everything works.
   
   If they are passed out of order, so that references have to resolve forward, then to_avro_datum_schemata panics. (parse_list seems to manage fine with any order.)
   
   I.e. if you modify your test like this, it will panic:
   
   `  let schemata: Vec<Schema> = Schema::parse_list(&[SCHEMA_B_STR, SCHEMA_A_STR]).unwrap();
       let schemata: Vec<&Schema> = schemata.iter().collect();
   
       // this is the Schema we want to use for write/read
       let schema_b = schemata[0];`
   
   This will be a problem with large schemas. For my use case some records have references that use 20+ schemas; guaranteeing they are provided in the right order would be a nightmare.
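
Until resolution is order-independent, one possible workaround is to sort the schemas by dependency before handing them over. The sketch below is a plain depth-first topological sort over schema names. The function name `dependency_order` and the name-to-references map are assumptions for illustration, not part of the crate. It emits dependencies first; reverse the result if the opposite order is needed. Note that it reports cycles as errors even though Avro permits recursive types, so it only suits non-recursive schema sets.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Orders schema names so that every schema appears after the schemata it
/// references. `deps` maps a schema name to the names it refers to.
/// Returns `Err` on a cycle or on a reference to an unknown name.
fn dependency_order(deps: &BTreeMap<&str, Vec<&str>>) -> Result<Vec<String>, String> {
    fn visit<'a>(
        name: &'a str,
        deps: &BTreeMap<&'a str, Vec<&'a str>>,
        done: &mut BTreeSet<&'a str>,
        in_progress: &mut BTreeSet<&'a str>,
        out: &mut Vec<String>,
    ) -> Result<(), String> {
        if done.contains(name) {
            return Ok(());
        }
        // `insert` returns false if the name is already on the current path.
        if !in_progress.insert(name) {
            return Err(format!("cyclic reference involving {name}"));
        }
        let children = deps
            .get(name)
            .ok_or_else(|| format!("unknown schema {name}"))?;
        for &child in children {
            visit(child, deps, done, in_progress, out)?;
        }
        in_progress.remove(name);
        done.insert(name);
        // Emitted only after all of its dependencies have been emitted.
        out.push(name.to_string());
        Ok(())
    }

    let mut done = BTreeSet::new();
    let mut in_progress = BTreeSet::new();
    let mut out = Vec::new();
    for &name in deps.keys() {
        visit(name, deps, &mut done, &mut in_progress, &mut out)?;
    }
    Ok(out)
}
```

The reference lists would have to be extracted from the schema JSON first; that step is left out here.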




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1354432025

   > 
   
   That fits with how I was thinking it would need to be done, both for read and write.  
   
   The only alternative I can think of,  would be to pass the Name of the record to be used, and let the resolver find it in the Schemata. 
   
   Either way, ideally there is an easy way to find the relevant schema in the Schemata slice. 




[GitHub] [avro] chupaty commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "chupaty (via GitHub)" <gi...@apache.org>.
chupaty commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1430539910

   Just wanted to add my support for this PR from the sidelines - I hope it will unblock the use of rust in my org - we have hundreds of interdependent schemas and ran across this problem during evaluation.  
   
   Thanks for all your good work!




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1465621864

   Thank you, @markfarnan !




[GitHub] [avro] untereiner commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
untereiner commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1358998092

   thanks @martin-g ! I will give it a try.




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1356867261

   Update: I've been testing this PR with our protocol schemas, and so far it works fine.
   
   - I went through and ordered the schemas by dependency, so it no longer panics. I think being able to handle schema resolution out of order would be a nice-to-have in the "ResolvedSchema::try_from(schemata)?;" section.
   
   - All 208 schemas are now parsing and resolving correctly in a single pass. This covers all messages of the protocol.
   
   - For finding the 'root' schemas, I'm post-processing the output of Schema::parse_list to create a qualifiedname/schema map to make it easy to acquire a 'root' schema when sending messages. Works so far.
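
The name-to-schema map from the last bullet could look roughly like this. It is a sketch only: the `Schema` struct with `namespace` and `name` fields, `fullname` and `index_by_fullname` are hypothetical stand-ins, and the real crate's types differ. The full name follows the Avro convention of joining namespace and name with a dot.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed named schema.
#[derive(Debug, PartialEq)]
struct Schema {
    namespace: Option<String>,
    name: String,
}

impl Schema {
    /// Avro full name: `namespace.name`, or just `name` without a namespace.
    fn fullname(&self) -> String {
        match &self.namespace {
            Some(ns) if !ns.is_empty() => format!("{ns}.{}", self.name),
            _ => self.name.clone(),
        }
    }
}

/// Indexes parsed schemata by full name so a root schema can be looked up
/// by name instead of by scanning the whole list.
fn index_by_fullname(schemata: &[Schema]) -> HashMap<String, &Schema> {
    schemata.iter().map(|s| (s.fullname(), s)).collect()
}
```

A lookup such as `index.get("com.example.A")` (an invented name) then returns the root schema to pass as the first argument, while the full list is still passed for resolution.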
   
   
   




[GitHub] [avro] martin-g commented on pull request #2014: Avro 3683 multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1353146476

   @markfarnan The impl is almost ready but as you can see the test in `lang/rust/avro/tests/to_from_avro_datum_schemata.rs` fails because both schemata validate against the read bytes.
   This is a blocker :-/




[GitHub] [avro] markfarnan commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "markfarnan (via GitHub)" <gi...@apache.org>.
markfarnan commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1411739287

   Checking this week. I'm still somewhat blocked by the missing upstream PRs, though I've got a temporary workaround for now.






[GitHub] [avro] martin-g merged pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g merged PR #2014:
URL: https://github.com/apache/avro/pull/2014




[GitHub] [avro] martin-g commented on pull request #2014: AVRO-3683: [Rust] Read/Write with multiple schemas

Posted by GitBox <gi...@apache.org>.
martin-g commented on PR #2014:
URL: https://github.com/apache/avro/pull/2014#issuecomment-1376370444

   @markfarnan I think I am ready with this PR. Please review it and test it before I merge!

