Posted to users@hudi.apache.org by Vinoth Chandar <vi...@apache.org> on 2020/07/13 01:37:04 UTC

Re: Question on hoodie.deltastreamer.schemaprovider.[source|target].schema.file with Parquet sources

Hi,

cc-ing users@, where these questions can be directed in the future.

> I do have a sql transform in the mix but both input and output schemas
> are ignored. Is this expected?

If we have a Dataset<Row>, then yes, we just implicitly use its schema. Do
you have a use case that we are not supporting today, or is the job failing?
If so, can you share the exception trace?
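
For reference, a minimal sketch of the two properties in question, assuming
the file-based schema provider
(org.apache.hudi.utilities.schema.FilebasedSchemaProvider) is the one
configured. The .avsc paths are hypothetical placeholders, not values from
this thread. As noted above, with a source that returns a Dataset<Row>, the
schema of the Dataset is used implicitly, so these files may have no visible
effect.

```properties
# Hypothetical paths, for illustration only; replace with your own Avro schema files.
# Avro schema describing the incoming records.
hoodie.deltastreamer.schemaprovider.source.schema.file=/path/to/source.avsc
# Avro schema describing the records written to the Hudi table.
hoodie.deltastreamer.schemaprovider.target.schema.file=/path/to/target.avsc
```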

Thanks
Vinoth



On Sat, Jul 11, 2020 at 12:25 PM Joaquim S <jo...@gmail.com> wrote:

> Hi!
>
> I am trying to understand how to leverage
> hoodie.deltastreamer.schemaprovider.[source|target].schema.file with
> parquet sources.
>
> During my tests, the avro schemas for input and output do not seem to
> matter, as they are consistently ignored.
>
> I understand that for Sources that return Dataset<Row>, the schema is
> obtained implicitly. I was expecting that to happen. I do have a sql
> transform in the mix but both input and output schemas are ignored. Is this
> expected? Should I use a different schema provider when sql transforms are
> in the mix?
>
> Thank you!
>