You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "sarutak (via GitHub)" <gi...@apache.org> on 2023/08/09 18:16:38 UTC

[GitHub] [avro] sarutak opened a new pull request, #2433: AVRO-3827: [Rust] Disallow duplicate field names

sarutak opened a new pull request, #2433:
URL: https://github.com/apache/avro/pull/2433

   AVRO-3827
   
   ## What is the purpose of the change
   
   This PR aims to fix an issue that the current Rust binding accepts duplicate field names.
   If a schema contains a record and some of its fields have the same field name, such schema should not be allowed.
   ```
   {
     "name": "my_schema",
     "type": "record",
     "fields": [
       {
         "name": "f1",
         "type": {
           "name": "a",
           "type": "record",
           "fields": []
         }
       },  {
         "name": "f1",
         "type": {
           "name": "b",
           "type": "record",
           "fields": []
         }
       }
     ]
    }
   ```
   
   ## Verifying this change
   Added new test and passed with `cargo test avro_3827`.
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   - *Extended interop tests to verify consistent valid schema names between SDKs*
   - *Added test that validates that Java throws an AvroRuntimeException on invalid binary data*
   - *Manually verified the change by building the website and checking the new redirect*
   
   ## Documentation
   No new features added.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on code in PR #2433:
URL: https://github.com/apache/avro/pull/2433#discussion_r1289625290


##########
lang/rust/avro/src/schema.rs:
##########
@@ -1440,7 +1440,12 @@ impl Parser {
                     .collect::<Result<_, _>>()
             })?;
 
+        let mut existing_fields: HashSet<&String> = HashSet::with_capacity(fields.len());
         for field in &fields {
+            if existing_fields.contains(&field.name) {
+                return Err(Error::FieldNameDuplicate(field.name.clone()));
+            }
+            existing_fields.insert(&field.name);

Review Comment:
   Do we need the new data structure ?
   We can use the returned value of https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.insert to decide whether to return an error or not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] sarutak commented on pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "sarutak (via GitHub)" <gi...@apache.org>.
sarutak commented on PR #2433:
URL: https://github.com/apache/avro/pull/2433#issuecomment-1672705641

   @martin-g Thank you for the review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] sarutak commented on a diff in pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "sarutak (via GitHub)" <gi...@apache.org>.
sarutak commented on code in PR #2433:
URL: https://github.com/apache/avro/pull/2433#discussion_r1289648096


##########
lang/rust/avro/src/schema.rs:
##########
@@ -1441,6 +1441,9 @@ impl Parser {
             })?;
 
         for field in &fields {
+            if lookup.contains_key(&field.name) {
+                return Err(Error::FieldNameDuplicate(field.name.clone()));
+            }
             lookup.insert(field.name.clone(), field.position);

Review Comment:
   I'll manually fix.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on code in PR #2433:
URL: https://github.com/apache/avro/pull/2433#discussion_r1289636579


##########
lang/rust/avro/src/schema.rs:
##########
@@ -1441,6 +1441,9 @@ impl Parser {
             })?;
 
         for field in &fields {
+            if lookup.contains_key(&field.name) {
+                return Err(Error::FieldNameDuplicate(field.name.clone()));
+            }
             lookup.insert(field.name.clone(), field.position);

Review Comment:
   I still think using the returned `Option` from `#insert()` is better than doing two lookups.
   ```suggestion
               if let Some(_old) = lookup.insert(field.name.clone(), field.position) {
                         return Err(Error::FieldNameDuplicate(field.name.clone()));
               }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g commented on PR #2433:
URL: https://github.com/apache/avro/pull/2433#issuecomment-1672704017

   Thank you, @sarutak !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] sarutak commented on a diff in pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "sarutak (via GitHub)" <gi...@apache.org>.
sarutak commented on code in PR #2433:
URL: https://github.com/apache/avro/pull/2433#discussion_r1289632189


##########
lang/rust/avro/src/schema.rs:
##########
@@ -1440,7 +1440,12 @@ impl Parser {
                     .collect::<Result<_, _>>()
             })?;
 
+        let mut existing_fields: HashSet<&String> = HashSet::with_capacity(fields.len());
         for field in &fields {
+            if existing_fields.contains(&field.name) {
+                return Err(Error::FieldNameDuplicate(field.name.clone()));
+            }
+            existing_fields.insert(&field.name);

Review Comment:
   Yes, we can use `lookup`. I've fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] sarutak commented on a diff in pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "sarutak (via GitHub)" <gi...@apache.org>.
sarutak commented on code in PR #2433:
URL: https://github.com/apache/avro/pull/2433#discussion_r1289637659


##########
lang/rust/avro/src/schema.rs:
##########
@@ -1441,6 +1441,9 @@ impl Parser {
             })?;
 
         for field in &fields {
+            if lookup.contains_key(&field.name) {
+                return Err(Error::FieldNameDuplicate(field.name.clone()));
+            }
             lookup.insert(field.name.clone(), field.position);

Review Comment:
   Ah, exactly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g merged pull request #2433: AVRO-3827: [Rust] Disallow duplicate field names

Posted by "martin-g (via GitHub)" <gi...@apache.org>.
martin-g merged PR #2433:
URL: https://github.com/apache/avro/pull/2433


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org