
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/20 15:29:34 UTC

[GitHub] [hudi] KarthickAN commented on issue #1977: Error running hudi on aws glue

KarthickAN commented on issue #1977:
URL: https://github.com/apache/hudi/issues/1977#issuecomment-677735782

@vinothchandar @umehrot2 Thank you for responding. I was able to run it after uploading a custom-built jar, which I produced by adding the following relocation to the pom:

<relocation>
  <pattern>org.eclipse.jetty.</pattern>
  <shadedPattern>org.apache.hudi.org.eclipse.jetty.</shadedPattern>
</relocation>

as mentioned in this issue thread: https://github.com/apache/hudi/issues/1789.
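
For anyone reproducing this, a relocation like the one above normally sits inside the maven-shade-plugin configuration of the bundle's pom. The following is a minimal sketch, assuming a standard shade setup: only the relocation itself comes from my change, while the surrounding plugin elements are illustrative, and which bundle module it belongs in (presumably the Spark bundle) is an assumption.

```xml
<!-- Sketch only: the surrounding plugin/execution elements are assumed,
     not copied from the actual Hudi pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Move the bundled Jetty classes under a Hudi-specific package
               so they cannot clash with the Jetty version on the Glue classpath. -->
          <relocation>
            <pattern>org.eclipse.jetty.</pattern>
            <shadedPattern>org.apache.hudi.org.eclipse.jetty.</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After rebuilding, the Jetty classes in the resulting jar should appear under org/apache/hudi/org/eclipse/jetty/ rather than org/eclipse/jetty/.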

I am currently evaluating Hudi with our existing data lake architecture, and it seems to integrate well. I do have a few questions, though:

1. Athena and Redshift Spectrum now support querying CoW Hudi tables. Will they support MoR tables as well in the near future?
2. Will AWS Glue natively support Hudi in the future?
3. Will you be adding support for CloudWatch metrics?
4. Is it possible to enable the Hive sync feature for the AWS Glue Catalog?
5. I also tried running the 0.6.0-rc1 build on AWS Glue. The job kept running even after it had written the data successfully, so I had to kill it manually. Is there an async feature that runs by default in this new version?
