Posted to dev@avro.apache.org by GitBox <gi...@apache.org> on 2022/08/12 08:43:34 UTC

[GitHub] [avro] martin-g opened a new pull request, #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

martin-g opened a new pull request, #1826:
URL: https://github.com/apache/avro/pull/1826

   Add unit tests for CustomAttributes#getAttribute(string)
   
   ### Jira
   
   - [X] https://issues.apache.org/jira/browse/AVRO-3601
   
   ### Tests
   
   - [X] My PR adds unit tests
   
   ### Commits
   
   - [X] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](https://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes how to use it.
     - All the public functions and the classes in the PR contain Javadoc that explain what it does
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@avro.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [avro] martin-g commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
martin-g commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r944376528


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   At the moment the content is preserved exactly as the user provided it. It could be JSON, XML, base64, ...
   It is up to the user application to encode and decode the values.
   You might be right about the non-optional representation (`""`), but IMO the optional return type is clearer. Other opinions are also welcome!
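
A minimal sketch of what the quoted expectations in the diff above imply, assuming the writer treats every attribute value as opaque text and emits it as a JSON string literal. `quoteAsJsonString` is a hypothetical helper name, not a function from the Avro C++ library, and only the most common escapes are handled here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the Avro C++ API): wraps an opaque
// attribute value in quotation marks and escapes the characters that
// would otherwise terminate or corrupt a JSON string literal.
// Assumption: other control characters below 0x20 do not occur.
std::string quoteAsJsonString(const std::string &value) {
    std::string out = "\"";
    for (char c : value) {
        switch (c) {
            case '"':  out += "\\\""; break;  // "  becomes \"
            case '\\': out += "\\\\"; break;  // \  becomes \\ (two characters)
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;      break;
        }
    }
    out += '"';
    return out;
}
```

With this approach a value such as `[1]` is written as the string `"[1]"` rather than as a JSON array, which is the behaviour the rest of this thread debates.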





[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
KalleOlaviNiemitalo commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r944321962


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   As in <https://github.com/apache/avro/pull/1821#discussion_r943586881>, I think this should be
   
   ```suggestion
           "[{\"name\": \"f1\", \"type\": \"long\", "
           "\"arrayField\": [1], "
           "\"booleanField\": true, "
           "\"mapField\": {\"key1\":\"value1\", \"key2\":\"value2\"}, "
           "\"nullField\": null, "
           "\"numberField\": 1.23, "
           "\"stringField\": \"field value with \\\"double quotes\\\"\""
   ```
   
   i.e. CustomAttributes.printJson should assume that the `std::string` values are already in JSON format, and write them out without adding any quotation marks around them or backslashes within them.  Likewise, callers of CustomAttributes::addAttribute (especially in Compiler.cc) should provide a JSON-format `std::string`.
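
The alternative proposed in this comment could look roughly like the sketch below, assuming every stored value is already valid JSON text and that attribute names need no escaping. `printAttributesVerbatim` is a hypothetical free function used for illustration; it is not the actual `CustomAttributes::printJson` implementation.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch (not the real printJson): each value is assumed to
// be JSON text already, so it is written verbatim, without surrounding
// quotation marks and without backslash escaping.
// Assumption: attribute names contain no characters that need escaping.
std::string printAttributesVerbatim(
    const std::map<std::string, std::string> &attributes) {
    std::ostringstream os;
    bool first = true;
    for (const auto &entry : attributes) {
        if (!first) {
            os << ", ";
        }
        first = false;
        os << '"' << entry.first << "\": " << entry.second;
    }
    return os.str();
}
```

Under that assumption a caller that wants a string attribute must pass the quotation marks itself, e.g. the value `"abc"` including the quotes, while `true`, `1.23` or `[1]` pass through as non-string JSON values.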





[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
KalleOlaviNiemitalo commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r945464410


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   I think, minimally, the library should be able to read a schema that contains custom attributes with arbitrary value types, but not necessarily able to preserve the values in memory and write them out again. That would help compatibility with future versions of Avro, e.g. new standard logical types.
   
   If CustomAttributes::attributes returns a reference to a map that contains the string values, then that makes it harder for a future version of the library to add support for other types without a breaking change.
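
One way to read the compatibility concern above is sketched here: if the class hands out values only through accessor functions, the internal storage can change later without breaking callers. `OpaqueAttributes` and its member names are hypothetical and do not reflect the actual `avro::CustomAttributes` interface; `std::optional` stands in for `boost::optional`.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical class for illustration only. The map is private, so a
// later version could store a richer value type without changing the
// signatures that callers depend on.
class OpaqueAttributes {
public:
    void add(const std::string &name, const std::string &jsonText) {
        values_[name] = jsonText;
    }

    // Returns the stored text, or std::nullopt if the name is unknown.
    std::optional<std::string> get(const std::string &name) const {
        auto it = values_.find(name);
        if (it == values_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Returns the attribute names in sorted order (std::map ordering).
    std::vector<std::string> names() const {
        std::vector<std::string> result;
        result.reserve(values_.size());
        for (const auto &entry : values_) {
            result.push_back(entry.first);
        }
        return result;
    }

private:
    std::map<std::string, std::string> values_;
};
```

By contrast, a public accessor that returns `const std::map<std::string, std::string> &` fixes the value type in the interface.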





[GitHub] [avro] martin-g commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
martin-g commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r945457612


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   The Avro spec does not say anything about the possible value types of the custom attributes/metadata.
   
   Until AVRO-3547 the C++ SDK didn't support them at all. (The Rust SDK does not support them yet either. I [expect](https://github.com/lerouxrgd/rsgen-avro/issues/32#issuecomment-1212123036) a user to open a ticket/PR this week).
   With AVRO-3601 we [found out](https://github.com/apache/avro/pull/1820#issuecomment-1211711217) that using JsonDom.hh for the custom attributes is not recommended, thus the string-based approach.
   
   I guess 1.11.2/1.12.0 will be released in several months, so whoever is interested in better handling of the custom attributes should step up and do it. Here I just tried to fix the broken installation of C++ SDK 1.11.1.





[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
KalleOlaviNiemitalo commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r944332461


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   Would the JSON representation of custom attributes be compatible with Avro IDL? The [IDL Language](https://avro.apache.org/docs/1.11.1/idl-language/) spec is not clear on whether the thing between parentheses in an annotation is always a JSON value.





[GitHub] [avro] martin-g commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
martin-g commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r945465497


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   > I guess 1.11.2/1.12.0 will be released in several months, so whoever is interested in better handling of the custom attributes should step up and do it. Here I just tried to fix the broken installation of C++ SDK 1.11.1.
   
   Let me re-phrase the above: PRs are very welcome!





[GitHub] [avro] martin-g merged pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
martin-g merged PR #1826:
URL: https://github.com/apache/avro/pull/1826




[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
KalleOlaviNiemitalo commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r944507324


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   In an Avro schema file, must all custom attributes of fields have string values?  I.e. is this invalid:
   
   ```JSON
   {
       "type": "record",
       "name": "Demo",
       "fields": [
           {
               "name": "field",
               "type": "string",
               "custom_flag": true
           }
       ]
   }
   ```
   
   If this schema is not invalid, then is the Avro C++ library able to load it from a file and then write it to another file, preserving the custom attribute?









[GitHub] [avro] KalleOlaviNiemitalo commented on a diff in pull request #1826: AVRO-3601: CustomAttributes#getAttribute() now returns boost::optional

Posted by GitBox <gi...@apache.org>.
KalleOlaviNiemitalo commented on code in PR #1826:
URL: https://github.com/apache/avro/pull/1826#discussion_r944324591


##########
lang/c++/test/unittest.cc:
##########
@@ -452,7 +457,14 @@ struct TestSchema {
                                             customAttributes);
         std::string expectedJsonWithCustomAttribute =
         "{\"type\": \"record\", \"name\": \"Test\",\"fields\": "
-        "[{\"name\": \"f1\", \"type\": \"long\",\"extra field\": \"1\"}]}";
+        "[{\"name\": \"f1\", \"type\": \"long\", "
+        "\"arrayField\": \"[1]\", "
+        "\"booleanField\": \"true\", "
+        "\"mapField\": \"{\\\"key1\\\":\\\"value1\\\", \\\"key2\\\":\\\"value2\\\"}\", "
+        "\"nullField\": \"null\", "
+        "\"numberField\": \"1.23\", "
+        "\"stringField\": \"\\\"field value with \\\"double quotes\\\"\\\"\""

Review Comment:
   If CustomAttributes worked that way, then it would be able to use just `std::string` rather than `boost::optional<std::string>`, because an empty `std::string` could mean that the attribute is not present, while an `std::string` containing two quotation marks `""` would mean that the value is an empty JSON string literal.
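
The trade-off described in this comment can be sketched as two lookup variants. Both function names are hypothetical, `std::optional` is used as a stand-in for `boost::optional`, and variant B is only sound under the assumption that every stored value is non-empty JSON text.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

using AttributeMap = std::map<std::string, std::string>;

// Variant A: absence is explicit, so an empty stored value stays
// distinguishable from a missing attribute.
std::optional<std::string> findOptional(const AttributeMap &attributes,
                                        const std::string &name) {
    auto it = attributes.find(name);
    if (it == attributes.end()) {
        return std::nullopt;
    }
    return it->second;
}

// Variant B: an empty std::string means "not present". This relies on
// the assumption that values are JSON text, which is never empty; an
// empty JSON string literal is stored as the two characters "".
std::string findOrEmpty(const AttributeMap &attributes,
                        const std::string &name) {
    auto it = attributes.find(name);
    return it == attributes.end() ? std::string() : it->second;
}
```

If values are opaque user text instead, an attribute whose value is the empty string cannot be told apart from a missing one in variant B, which is the case the optional return type covers.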


