Posted to dev@amaterasu.apache.org by Nadav Har Tzvi <na...@gmail.com> on 2019/03/18 18:12:48 UTC

[Discuss] datasets input file in user's repository

Hi,

Just wanna open this up for discussion, as it seems we somehow skipped this
point.
By now we pretty much have the new datasets APIs in place in the Python SDK
and in the implementing frameworks (amaterasu-pyspark, amaterasu-pandas,
amaterasu-python).
The only question left is how we get the dataset definitions.
Currently, we still look up the dataset definitions in the maki file,
under the action's exports.
Do we intend to keep it that way? I assume not, as I think that every action
needs access to all defined datasets.
In that case, how will the user submit the datasets configuration? Is it
another file next to the maki.yaml? Is it a file that resides in the
environment, e.g. next to the env.yaml? Or is it not even a file of its own,
but a part of the env.yaml?
Ideas, anyone?
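
To make the three options concrete, here is a rough sketch in Python. The
directory layout (an env/<name>/ folder), the datasets.yaml file name, and the
"datasets" key are assumptions for illustration only; none of them is an
existing Amaterasu convention.

```python
from pathlib import PurePosixPath


def candidate_dataset_locations(repo_root: str, env_name: str) -> dict:
    """Return the three places the datasets configuration could live.

    All paths and the 'datasets' key are hypothetical, for discussion only.
    """
    root = PurePosixPath(repo_root)
    # Assumed layout: one folder per environment under env/.
    env_dir = root / "env" / env_name
    return {
        # Option 1: a separate file next to maki.yaml.
        "next_to_maki": str(root / "datasets.yaml"),
        # Option 2: a separate file next to env.yaml.
        "next_to_env": str(env_dir / "datasets.yaml"),
        # Option 3: a section inside env.yaml itself.
        "inside_env": str(env_dir / "env.yaml") + "#datasets",
    }


locations = candidate_dataset_locations("my-job", "dev")
for option, location in locations.items():
    print(f"{option}: {location}")
```

Only option 1 is shared by all environments; options 2 and 3 give each
environment its own definitions.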

Let's discuss this please!

Cheers,
Nadav

Re: [Discuss] datasets input file in user's repository

Posted by Nadav Har Tzvi <na...@gmail.com>.
I assumed as much.
I will implement a change to the leader so that it loads the datasets from
the env. It will become part of the PR for Amaterasu-46.

Cheers,
Nadav



On Tue, 19 Mar 2019 at 01:36, Yaniv Rodenski <ya...@shinto.io> wrote:

> Hi Nadav,
>
> I think datasets should be per environment (for example, it is very common
> to use different databases for dev/test/prod), so I think that datasets, as
> configurations in Amaterasu, should sit under env.
>
> Cheers,
> Yaniv

Re: [Discuss] datasets input file in user's repository

Posted by Yaniv Rodenski <ya...@shinto.io>.
Hi Nadav,

I think datasets should be per environment (for example, it is very common
to use different databases for dev/test/prod), so I think that datasets, as
configurations in Amaterasu, should sit under env.
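
A minimal sketch of that idea in Python, using plain dictionaries in place of
parsed env.yaml files. The "datasets" key, the dataset name "orders", and the
URIs are made up for illustration and are not Amaterasu's actual schema.

```python
# Hypothetical parsed contents of each environment's configuration.
ENVIRONMENTS = {
    "dev": {"datasets": {"orders": {"uri": "jdbc:postgresql://dev-db/orders"}}},
    "test": {"datasets": {"orders": {"uri": "jdbc:postgresql://test-db/orders"}}},
    "prod": {"datasets": {"orders": {"uri": "jdbc:postgresql://prod-db/orders"}}},
}


def datasets_for(env_name: str, environments: dict = ENVIRONMENTS) -> dict:
    """Return every dataset defined for the given environment."""
    if env_name not in environments:
        raise ValueError(f"unknown environment: {env_name!r}")
    # An environment without a 'datasets' section simply defines none.
    return environments[env_name].get("datasets", {})


# The same dataset name resolves to a different database per environment.
for name in ("dev", "test", "prod"):
    print(name, datasets_for(name)["orders"]["uri"])
```

The action code asks for "orders" by name either way; only the selected
environment decides which database that name points to.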

Cheers,
Yaniv



-- 
Yaniv Rodenski

+61 477 778 405
yaniv@shinto.io