Posted to commits@hudi.apache.org by "sydneyhoran (via GitHub)" <gi...@apache.org> on 2023/04/20 18:01:47 UTC

[GitHub] [hudi] sydneyhoran commented on issue #661: Tracking ticket for reporting Hudi usages from the community

sydneyhoran commented on issue #661:
URL: https://github.com/apache/hudi/issues/661#issuecomment-1516735390

   Our data engineering team at [Penn Interactive](https://www.penn-interactive.com/)/[TheScore](https://www.thescore.com/) is currently developing a new data platform following the combination of our two companies. We operate in online and retail sports betting and sports media, and are based in the US and Canada.
   
   We are implementing a Hudi data lake as the foundational data layer of our analytics and reporting platform, using Deltastreamer and other Hudi Spark jobs to ingest data. We are streaming CDC logs from approximately 1,200 tables across 75 Postgres databases within the company, reading them from Confluent Cloud Kafka topics with PostgresDebeziumSource. We are also using Deltastreamer for multiple batch ingestion jobs to further enrich the data lake.
   
   The new data platform will power the business intelligence and compliance/reporting operations for [TheScoreBet](https://thescore.bet/) and [Barstool Sportsbook](https://www.barstoolsportsbook.com/), subsidiary companies of [Penn Entertainment](https://www.pennentertainment.com/).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
