Posted to dev@beam.apache.org by Tarrence van As <ta...@vanas.family> on 2021/07/31 22:36:46 UTC

Golang: Custom Type Coder

Hi,

Thank you for the great work on Beam. I'm using the golang sdk and hoping
someone might be able to help with defining a custom coder for a type.

I have a type like this:
type Execer struct {
    Query string        `json:"query,omitempty"`
    Args  []interface{} `json:"args,omitempty"`
}

Where each element in Args implements `driver.Valuer`.

When trying to run with this as an element of a PCollection I get:

panic: unable to encode type: interface {}

My guess is that I need to define an encode/decode method for the type, but
I'm not sure what interface it should implement. Any direction would be
appreciated.

Tarrence

Re: Golang: Custom Type Coder

Posted by Robert Burke <lo...@apache.org>.
Hi Tarrence! Thank you. I have a few answers to your questions.
1. The Go Direct Runner doesn't test coders or serialization, and outright avoids them, so it erroneously passes in cases like this.

2. After about version 2.30, IIRC, the Go SDK stopped using JSON as the default coder for struct types in favour of Beam Schema Row coders. As a more compact binary encoding, it's much more efficient than JSON, largely because the types are static throughout a pipeline. As a result, however, it can't handle interface types. JSON gets away with interface{} types because it encodes the field names for every element, among other things.

So my recommendation would be to not lean on interfaces for elements within pipelines.
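
One way to follow that advice, sketched here under assumptions rather than taken from the Beam docs: resolve each argument to a concrete value before it enters the pipeline and store it in a tagged struct. The `Arg` type, its field names, and the `toArg` and `NewExecer` helpers are hypothetical. The set of cases is modelled on the types a `driver.Valuer` is allowed to return (minus `time.Time`, which is left out for brevity), so a struct like this should have only static field types.

```go
package main

import "fmt"

// Arg is a hypothetical, statically typed stand-in for one interface{} argument.
// Kind says which of the value fields is meaningful.
type Arg struct {
	Kind  string // "null", "string", "int", "float", "bool" or "bytes"
	Str   string
	Int   int64
	Float float64
	Bool  bool
	Bytes []byte
}

// Execer mirrors the type from the question, with concrete Args.
type Execer struct {
	Query string
	Args  []Arg
}

// toArg converts a dynamically typed value into an Arg, rejecting
// anything it does not know how to represent.
func toArg(v interface{}) (Arg, error) {
	switch x := v.(type) {
	case nil:
		return Arg{Kind: "null"}, nil
	case string:
		return Arg{Kind: "string", Str: x}, nil
	case int64:
		return Arg{Kind: "int", Int: x}, nil
	case float64:
		return Arg{Kind: "float", Float: x}, nil
	case bool:
		return Arg{Kind: "bool", Bool: x}, nil
	case []byte:
		return Arg{Kind: "bytes", Bytes: x}, nil
	default:
		return Arg{}, fmt.Errorf("unsupported argument type %T", v)
	}
}

// NewExecer builds an Execer from a query and dynamically typed arguments.
func NewExecer(query string, args ...interface{}) (Execer, error) {
	out := Execer{Query: query, Args: make([]Arg, 0, len(args))}
	for i, a := range args {
		arg, err := toArg(a)
		if err != nil {
			return Execer{}, fmt.Errorf("arg %d: %w", i, err)
		}
		out.Args = append(out.Args, arg)
	}
	return out, nil
}

func main() {
	e, err := NewExecer("UPDATE t SET name = ? WHERE id = ?", "beam", int64(7))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", e)
}
```

The trade-off is a little conversion code at the pipeline boundary in exchange for elements that don't need a custom coder at all.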

That said, you're always free to create and register a custom coder that uses JSON under the hood, especially if the elements are few or, as I'm guessing, you're shelling out to some other process in a custom container...

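e.g.
```
// Needs "encoding/json", "reflect" and the Beam Go SDK's beam package.

func init() {
	// Register the coder before the pipeline is built.
	beam.RegisterCoder(reflect.TypeOf(Execer{}), jsonEnc, jsonDec)
}

// jsonEnc encodes an Execer element as JSON.
func jsonEnc(in beam.T) ([]byte, error) {
	v := in.(Execer)
	return json.Marshal(v)
}

// jsonDec decodes JSON bytes back into an Execer element.
func jsonDec(in []byte) (beam.T, error) {
	v := Execer{}
	if err := json.Unmarshal(in, &v); err != nil {
		return nil, err
	}
	return v, nil
}
```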

Which should get you the desired behavior on distributed runners.
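
One caveat worth seeing in isolation, as a stdlib-only sketch that runs without Beam: a JSON round trip does not restore the original dynamic types inside `Args`. The function names `encodeExecer` and `decodeExecer` are hypothetical and use the concrete type directly instead of `beam.T`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Execer is the type from the original question.
type Execer struct {
	Query string        `json:"query,omitempty"`
	Args  []interface{} `json:"args,omitempty"`
}

// encodeExecer marshals an Execer to JSON bytes.
func encodeExecer(e Execer) ([]byte, error) {
	return json.Marshal(e)
}

// decodeExecer unmarshals JSON bytes into an Execer.
func decodeExecer(b []byte) (Execer, error) {
	var e Execer
	if err := json.Unmarshal(b, &e); err != nil {
		return Execer{}, err
	}
	return e, nil
}

func main() {
	in := Execer{Query: "SELECT * FROM t WHERE id = ?", Args: []interface{}{7}}

	b, err := encodeExecer(in)
	if err != nil {
		panic(err)
	}

	out, err := decodeExecer(b)
	if err != nil {
		panic(err)
	}

	// The argument went in as an int but comes back as a float64,
	// because encoding/json decodes numbers into interface{} that way.
	fmt.Printf("%T\n", out.Args[0]) // prints float64
}
```

If the elements of `Args` are richer types that implement `driver.Valuer`, they will likewise come back as plain JSON values (strings, float64s, maps, slices), so the decoder would need extra logic to rebuild them.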

Cheers,
Robert Burke
Beam Go Busybody

On 2021/08/01 01:18:40, Tarrence van As <ta...@vanas.family> wrote: 
> An additional piece of information that might be of interest:
> 
> This issue occurs if I try to create a PCollection with `beam.Create(s,
> Execer{q, a})` but seems to work (at least with the direct runner) when I
> emit the struct like `emit(Execer{q, a})`.
> 
> Tarrence

Re: Golang: Custom Type Coder

Posted by Tarrence van As <ta...@vanas.family>.
An additional piece of information that might be of interest:

This issue occurs if I try to create a PCollection with `beam.Create(s,
Execer{q, a})` but seems to work (at least with the direct runner) when I
emit the struct like `emit(Execer{q, a})`.

Tarrence
