Posted to dev@avro.apache.org by Patrick Farry <pa...@gmail.com> on 2019/08/06 03:55:46 UTC

Re: C# POCO serializer/deserializer

Hi Brian,

I got the generator that produces a schema from C# source code working, at least well enough for us. I'm not sure it is ready for general use, but if you’re interested I can send the code. What is the usual practice for things under development: should I create a PR even if the code is some way off from being mergeable, or just keep a branch in my repo?

Our use case is that the developers write C# DTO classes and don’t need to worry much about the schema. We build a protocol from the classes and then schemas for each top level object from the protocol. 
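
To make the use case concrete, here is a minimal sketch of deriving an Avro record schema from a DTO class. It uses runtime reflection as a stand-in and is not the Roslyn-based source analysis discussed in this thread; the `Address` class, the `SchemaSketch` helper, and the type mapping are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTO; the name and properties are illustrative only.
public class Address
{
    public string Street { get; set; }
    public int Zip { get; set; }
    public double? Latitude { get; set; }
}

public static class SchemaSketch
{
    // Maps a handful of CLR types to Avro primitive type names.
    private static readonly Dictionary<Type, string> Primitives = new Dictionary<Type, string>
    {
        { typeof(string), "string" },
        { typeof(int), "int" },
        { typeof(long), "long" },
        { typeof(float), "float" },
        { typeof(double), "double" },
        { typeof(bool), "boolean" },
        { typeof(byte[]), "bytes" },
    };

    private static string AvroType(Type t)
    {
        // Nullable<T> becomes the union ["null", T].
        Type inner = Nullable.GetUnderlyingType(t);
        if (inner != null)
            return "[\"null\", " + AvroType(inner) + "]";
        if (Primitives.TryGetValue(t, out string name))
            return "\"" + name + "\"";
        throw new NotSupportedException("No mapping for " + t.FullName);
    }

    // Builds the JSON text of a record schema with one field per public property.
    public static string RecordSchema(Type dto)
    {
        var fields = dto.GetProperties()
            .Select(p => "{\"name\": \"" + p.Name + "\", \"type\": " + AvroType(p.PropertyType) + "}");
        return "{\"type\": \"record\", \"name\": \"" + dto.Name + "\", \"fields\": ["
            + string.Join(", ", fields) + "]}";
    }

    public static void Main()
    {
        Console.WriteLine(RecordSchema(typeof(Address)));
    }
}
```

Two choices here are simplifications: `string` properties are emitted as non-nullable `"string"`, and field order follows whatever `GetProperties()` returns, which the runtime does not guarantee to be declaration order. A real generator would need an explicit policy for both.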

Also, I think we (Pitney Bowes) need to get something going about a JSON encoder. Shifting to Avro is a bit of a stretch for some of our developers since they can’t easily see the message contents without additional decoding tools. Are you ok with basing it on JSON.net? It should be pretty fast to get something working.

> On Jul 11, 2019, at 5:05 PM, Brian Lachniet <bl...@gmail.com> wrote:
> 
> Hi Patrick!
> 
> I'm sorry I haven't gotten back to the Reflect review yet. I'm hoping to
> get back on that this weekend.
> 
> That sounds like a good list of additions. I'm particularly interested in
> having support for JSON encoding/decoding in C#. I bet we could get a long
> ways simply porting the JsonEncoder
> <https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/io/JsonEncoder.java>
> from the Java bindings.
> 
> I agree that we need to bump the Newtonsoft.Json version. That just came up
> in another issue recently. What version of Newtonsoft.Json do you think we
> should upgrade to? I'm a little wary of bumping all the way to the latest,
> as that would force everyone using Avro to also go to the latest version.
> Maybe that's not a bad thing, I'm not sure.
> 
> I'm not familiar with Roslyn-based code generation. I'll do some Googling
> on that (if you've got some good articles or starting points to share, I'd
> appreciate that). Codegen currently uses the System.CodeDom API. Would
> Roslyn-based code generation be a replacement for that, or am I
> misunderstanding something?
> 
> Thank you for your patience so far, Patrick. I'm looking forward to these
> new features too!
> 
> On Thu, Jul 11, 2019 at 6:14 PM Patrick Farry <pa...@gmail.com>
> wrote:
> 
>> Hi Brian,
>> 
>> Not meaning to hassle you (well maybe a little). I’ve got a few other
>> enhancements we'd like to do as well as the Reflect code, and was hoping
>> not to have to maintain a bunch of branches.
>> 
>> Here are the planned changes:
>> - update newtonsoft.json and add the json path whenever the schema parser
>> reports an error - we have a number of large schemas and debugging them is
>> proving difficult
>> - Roslyn based schema generator from C# class
>> - Reflect option added to codegen (possibly updating codegen to use Roslyn)
>> - Serialize to json
>> 
>> Thanks.
>> 
>>> On May 2, 2019, at 6:22 PM, Brian Lachniet <bl...@gmail.com> wrote:
>>> 
>>> Hey Patrick,
>>> 
>>> This sounds very useful! I'd love to see this introduced to the C#
>> library.
>>> 
>>> On Thu, May 2, 2019 at 5:47 PM Ivan Greene <ig...@fanthreesixty.com>
>>> wrote:
>>> 
>>>> Patrick,
>>>> 
>>>> This sounds as though it would be the C# equivalent of Java's
>> ReflectData,
>>>> which has been part of the Avro API for several years now, so it’s
>> likely
>>>> there would be interest. Your best bet is to open a Jira describing the
>>>> planned feature and open a pull request on Github to start the
>> discussion.
>>>> The contributing page on the wiki is a bit out of date and still
>> recommends
>>>> submitting patches on Jira but I believe that Github is now the official
>>>> repository.
>>>> 
>>>>> On May 1, 2019, at 4:31 PM, Patrick Farry <pa...@gmail.com>
>>>> wrote:
>>>>> 
>>>>> Hi All,
>>>>> 
>>>>> I have written code that implements Avro serialization for POCO classes
>>>> - i.e. classes that are not generated by Avro codegen and do not
>> implement
>>>> ISpecificRecord. The idea was to make it work as much like JSON.net as
>>>> possible.
>>>>> 
>>>>> The serializer inherits from SpecificDefaultWriter and the deserializer
>>>> from SpecificDefaultReader.
>>>>> 
>>>>> Avro fields are mapped to C# properties either by matching the field
>>>> name and property name or by using an attribute to specify the field
>>>> sequence number.
>>>>> 
>>>>> Is this something that would be of interest to the Avro project? My
>>>> company has approved committing the code and I’d be available and happy
>> to
>>>> maintain this code and work on other parts of the C# implementation.
>>>>> 
>>>>> Regards.
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> 
>>> Brian Lachniet
>>> 
>>> Software Engineer
>>> 
>>> E: blachniet@gmail.com | blachniet.com <http://www.blachniet.com>
>>> 
>>> <https://twitter.com/blachniet> <http://www.linkedin.com/in/blachniet>
>> 
>> 
> 
> -- 
> 
> Brian Lachniet
> 
> Software Engineer
> 
> E: blachniet@gmail.com | blachniet.com <http://www.blachniet.com>
> 
> <https://twitter.com/blachniet> <http://www.linkedin.com/in/blachniet>


Re: C# POCO serializer/deserializer

Posted by Patrick Farry <pa...@gmail.com>.
The schema generator is checked in as a separate project for now. It takes a number of C# classes and generates an Avro protocol using the Microsoft.CodeAnalysis package.

https://github.com/pa009fa/schemagen

There is a driver program which runs with the following parameters:

$ dotnet run -- --help
schemagen 1.0.0
Copyright (C) 2019 schemagen

  -i, --include              Required. Includes

  -o, --outfile              (Default: .) Output file. . for console

  -t, --target               Required. File name of the type to generate

  -d, --defaultconverters    Required. Default converters

  -c, --converters           Required. Default converters

  --help                     Display this help screen.

  --version                  Display version information.

The VSCode debugger is set up to generate a protocol from the classes in the csharp folder.

There are a number of new attributes to help when specifying the schema. Examples are in the Z.cs file. Attributes specify unions, default values, etc.
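
As a rough illustration of how attribute-driven schema hints can work in C#, here is a self-contained sketch. The attribute names (`DefaultValueSketch`, `UnionSketch`) and the `Parcel` class are hypothetical and are not taken from the schemagen repository; the real attributes are the ones shown in Z.cs. The sketch also reads the attributes through reflection, whereas the generator described here analyses source code with Microsoft.CodeAnalysis.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute: records a default value for the generated field.
[AttributeUsage(AttributeTargets.Property)]
public sealed class DefaultValueSketchAttribute : Attribute
{
    public object Value { get; }
    public DefaultValueSketchAttribute(object value) { Value = value; }
}

// Hypothetical attribute: lists the branch types of a union-typed field.
[AttributeUsage(AttributeTargets.Property)]
public sealed class UnionSketchAttribute : Attribute
{
    public Type[] Branches { get; }
    public UnionSketchAttribute(params Type[] branches) { Branches = branches; }
}

// Hypothetical DTO annotated with the attributes above.
public class Parcel
{
    [DefaultValueSketch(0)]
    public int WeightGrams { get; set; }

    [UnionSketch(typeof(string), typeof(long))]
    public object TrackingId { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        // Print the schema hints found on each property.
        foreach (PropertyInfo p in typeof(Parcel).GetProperties())
        {
            var def = p.GetCustomAttribute<DefaultValueSketchAttribute>();
            var union = p.GetCustomAttribute<UnionSketchAttribute>();
            if (def != null)
                Console.WriteLine($"{p.Name}: default = {def.Value}");
            if (union != null)
                Console.WriteLine($"{p.Name}: union of "
                    + string.Join(", ", Array.ConvertAll(union.Branches, b => b.Name)));
        }
    }
}
```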

The Reflect array helpers are not supported yet.

Any and all feedback is welcome. 

At this stage I’m not sure it’s ready for a detailed code review (but that would be fine as well if anyone has the time).

Any thoughts on how (or whether) to integrate it into the project are welcome too.

Cheers.



> On Aug 10, 2019, at 10:36 AM, Brian Lachniet <bl...@gmail.com> wrote:
> 
> Hey Patrick,
> 
> That sounds pretty cool. I think that the best way to handle this right
> now, if you believe there is additional work to make it ready for more
> general use, is to keep it on a separate branch in your repository.
> 
> I wonder if this sort of functionality exists in other language bindings
> (I'm really only familiar with the C# bindings). For those familiar with
> the other languages, has this kind of thing been done before?
> 
> I agree that the C# bindings could really benefit from having a JSON
> encoder. And yes, we should definitely take advantage of the JSON.net
> library. In implementation, I imagine this would take the form of new
> implementations of the *Encoder* and *Decoder* interfaces, *JsonEncoder* and
> *JsonDecoder*.
> 


Re: C# POCO serializer/deserializer

Posted by Brian Lachniet <bl...@gmail.com>.
Hey Patrick,

That sounds pretty cool. I think that the best way to handle this right
now, if you believe there is additional work to make it ready for more
general use, is to keep it on a separate branch in your repository.

I wonder if this sort of functionality exists in other language bindings
(I'm really only familiar with the C# bindings). For those familiar with
the other languages, has this kind of thing been done before?

I agree that the C# bindings could really benefit from having a JSON
encoder. And yes, we should definitely take advantage of the JSON.net
library. In implementation, I imagine this would take the form of new
implementations of the *Encoder* and *Decoder* interfaces, *JsonEncoder* and
*JsonDecoder*.
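
A minimal sketch of what such an encoder might look like on top of JSON.net's `JsonTextWriter` follows. The `JsonEncoderSketch` class and its method names are hypothetical; it deliberately does not implement the real `Avro.IO.Encoder` interface, and it assumes the Newtonsoft.Json package is referenced. One thing it shows is the Avro JSON encoding rule for unions: a non-null value is wrapped in a single-key object named after the branch type, while null is written as plain `null`.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

// Hypothetical class: it borrows a few method names from the binary encoder
// but is not an implementation of the Avro.IO.Encoder interface.
public sealed class JsonEncoderSketch : IDisposable
{
    private readonly JsonTextWriter _writer;

    public JsonEncoderSketch(TextWriter output)
    {
        _writer = new JsonTextWriter(output);
    }

    // Records map to JSON objects; the caller supplies field names here.
    public void WriteRecordStart() => _writer.WriteStartObject();
    public void WriteRecordEnd() => _writer.WriteEndObject();
    public void WriteFieldName(string name) => _writer.WritePropertyName(name);

    public void WriteNull() => _writer.WriteNull();
    public void WriteInt(int value) => _writer.WriteValue(value);
    public void WriteString(string value) => _writer.WriteValue(value);

    // A non-null union value is wrapped as {"<branch type name>": value}.
    public void WriteUnionBranchStart(string branchTypeName)
    {
        _writer.WriteStartObject();
        _writer.WritePropertyName(branchTypeName);
    }

    public void WriteUnionBranchEnd() => _writer.WriteEndObject();

    public void Flush() => _writer.Flush();
    public void Dispose() => _writer.Close();
}

public static class Demo
{
    public static void Main()
    {
        var sw = new StringWriter();
        using (var enc = new JsonEncoderSketch(sw))
        {
            enc.WriteRecordStart();
            enc.WriteFieldName("id");
            enc.WriteInt(42);
            enc.WriteFieldName("nickname");   // union ["null", "string"], string branch
            enc.WriteUnionBranchStart("string");
            enc.WriteString("pat");
            enc.WriteUnionBranchEnd();
            enc.WriteFieldName("email");      // union ["null", "string"], null branch
            enc.WriteNull();
            enc.WriteRecordEnd();
            enc.Flush();
        }
        Console.WriteLine(sw.ToString());
        // prints {"id":42,"nickname":{"string":"pat"},"email":null}
    }
}
```

The explicit `WriteFieldName` call is the main simplification. The Java `JsonEncoder` is constructed with the schema and derives field names from it, so a real port would probably need the schema as well rather than relying on the caller.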


-- 

Brian Lachniet

Software Engineer

E: blachniet@gmail.com | blachniet.com <http://www.blachniet.com>

<https://twitter.com/blachniet> <http://www.linkedin.com/in/blachniet>