Posted to dev@hudi.apache.org by Raymond Xu <xu...@gmail.com> on 2021/07/03 21:10:27 UTC

[DISCUSS] scenario-based quickstart demo

I found the demo setup in the "docker" directory not beginner-friendly. It
took some effort to digest what's there, and it's hard to play with.
I'm proposing a scenario-based quickstart setup:

- Scenario 1: DeltaStreamer write
  - sample raw dataset, local FS
  - run DeltaStreamer with local Spark or Flink, writing to COW or MOR
- Scenario 2: meta sync
  - sample hoodie table (COW or MOR), local FS
  - run hive sync with local Hive server
- Scenario 3: SQL read
  - sample hoodie table (COW or MOR), local FS
  - run local Trino/Presto queries
- More scenarios: incremental read, clustering, etc

In all scenarios, users can choose between a release version and the local
version of Hudi.

This is not meant to replace the current "docker" demo. It can live under a
"quickstart" dir and aims to be a more focused, quick sandbox. A typical dev
flow is:
1. change some code
2. run mvn install -DskipTests
3. play with the affected scenarios to verify the change
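
As a rough sketch of that flow, the loop could be driven by a small helper
like the one below. Everything specific in it is an assumption made for
illustration: the quickstart/<scenario>/run.sh scripts, the scenario names,
and the HUDI_VERSION variable do not exist in the repo today. The only
command taken from the flow above is mvn install -DskipTests.

```python
# Hypothetical sketch: the quickstart/ layout, scenario names, run.sh scripts,
# and HUDI_VERSION variable are illustrative assumptions, not existing Hudi files.
from typing import List

SCENARIOS = {
    "deltastreamer-write": "DeltaStreamer write to COW or MOR on local FS",
    "meta-sync": "Hive sync against a local Hive server",
    "sql-read": "Trino/Presto queries on a sample hoodie table",
}


def plan(scenario: str, hudi_version: str = "local") -> List[str]:
    """Return the shell commands for one iteration of the dev flow."""
    if scenario not in SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario}")
    steps = []
    if hudi_version == "local":
        # Step 2 of the dev flow: rebuild the local artifacts, skipping tests.
        steps.append("mvn install -DskipTests")
    # Step 3: run only the affected scenario (hypothetical script and env var).
    steps.append(f"HUDI_VERSION={hudi_version} quickstart/{scenario}/run.sh")
    return steps


if __name__ == "__main__":
    for cmd in plan("deltastreamer-write"):
        print(cmd)
```

Passing a release version instead of "local" would skip the build step, which
is one way the "release vs. local" choice could be exposed.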

Any thoughts or comments? Thank you.

Re: [DISCUSS] scenario-based quickstart demo

Posted by Sivabalan <n....@gmail.com>.
+1. Agreed, we don't have recipes for each feature as such. This would
benefit users who are interested in a particular feature.

On Tue, Jul 6, 2021 at 2:17 AM Vinoth Chandar <vi...@apache.org> wrote:

> Hi Raymond,
>
> Are you suggesting a fix to the dev workflow or general site/quickstart
> docs?
>
> Agreed that the current doc is all-at-once; at the least, better docs on
> incrementally testing parts could be useful.
> It takes a while to learn what to skip and what not to.
>
> Thanks
> Vinoth
>


-- 
Regards,
-Sivabalan

Re: [DISCUSS] scenario-based quickstart demo

Posted by Vinoth Chandar <vi...@apache.org>.
Hi Raymond,

Are you suggesting a fix to the dev workflow or general site/quickstart
docs?

Agreed that the current doc is all-at-once; at the least, better docs on
incrementally testing parts could be useful.
It takes a while to learn what to skip and what not to.

Thanks
Vinoth
