Posted to dev@hudi.apache.org by Joaquim S <jo...@gmail.com> on 2020/03/24 18:06:57 UTC

DMS - org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class

Team,

When following the blog "Change Capture Using AWS Database Migration
Service and Hudi" with my own data set, the initial load works perfectly.
When issuing the command with the DMS CDC files on S3, I get the following
error:

20/03/24 17:56:28 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class!
    at org.apache.hudi.utilities.sources.InputBatch.getSchemaProvider(InputBatch.java:53)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:312)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)

I tried passing --schemaprovider-class
org.apache.hudi.utilities.schema.FilebasedSchemaProvider.Source and providing
the schema. The error no longer occurs, but nothing is written to Hudi.

I am not performing any transformations (other than the DMS transform) and
am using the default record key strategy.

If the team has any pointers, please let me know.

Thank you!

Re: DMS - org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class

Posted by Vinoth Chandar <vi...@apache.org>.
Thanks! I filed https://jira.apache.org/jira/browse/HUDI-735 to make that
exception more user-friendly.

Please have at it, if anyone is interested in this :)

On Wed, Mar 25, 2020 at 11:40 AM Joaquim S <jo...@gmail.com> wrote:

> Thank you, Vinoth. I was able to find the issue. All my column names were in
> upper case. I switched the column names and table names to lower case, and
> it works perfectly.

Re: DMS - org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class

Posted by Joaquim S <jo...@gmail.com>.
Thank you, Vinoth. I was able to find the issue. All my column names were in
upper case. I switched the column names and table names to lower case, and
it works perfectly.
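
For anyone hitting the same thing, the renaming can be sketched roughly as
below. This is an illustration only: the column names are hypothetical, the
helper lowercase_columns is not part of Hudi or DMS, and the thread does not
say where the rename was actually applied (source database, DMS mapping, or
elsewhere). The check for clashes is an added precaution, since two names that
differ only by case would collapse into one.

```python
from collections import Counter


def lowercase_columns(columns):
    """Return lower-cased column names, refusing to merge two columns into one."""
    lowered = [name.lower() for name in columns]
    # Names that differ only by case would collide after lower-casing.
    clashes = sorted(name for name, count in Counter(lowered).items() if count > 1)
    if clashes:
        raise ValueError(f"lower-casing would merge columns: {clashes}")
    return lowered


# Hypothetical upper-case column names, for illustration only.
print(lowercase_columns(["ORDER_ID", "CUSTOMER_NAME", "UPDATED_AT"]))
```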



On Wednesday, 25/03/2020 at 11:04, Vinoth Chandar <vi...@apache.org>
wrote:

> Hi,
>
> That's surprising. Do you have --source-class
> org.apache.hudi.utilities.sources.ParquetDFSSource?
> I ask since, for Row-based sources, the schema provider is auto-configured,
> as shown in the blog page.
>
> Thanks
> Vinoth

Re: DMS - org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class

Posted by Vinoth Chandar <vi...@apache.org>.
Hi,

That's surprising. Do you have --source-class
org.apache.hudi.utilities.sources.ParquetDFSSource?
I ask since, for Row-based sources, the schema provider is auto-configured,
as shown in the blog page.

Thanks
Vinoth
