Posted to dev@libcloud.apache.org by Tomaz Muraus <to...@apache.org> on 2020/07/01 10:00:52 UTC

[dev] Publishing up to date pricing data (pricing.json file) to a well known location

Recently, one of the Libcloud contributors (Eis-D-Z) published various
improvements to our price scraping scripts and added some new ones -
https://github.com/apache/libcloud/pulls/Eis-D-Z.

I think it would now make sense to run those scraping scripts on a
continuous basis as part of our CI (e.g. once a day) and publish the
generated file to a well-known location (e.g. a public read-only S3
bucket).

In fact, that was also the plan when we originally added the
libcloud.pricing.download_pricing_file function and related
functionality quite a long time ago.
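To make the intended usage concrete, here is a minimal sketch (the
driver name and size ID below are just illustrative values):

    from libcloud.pricing import download_pricing_file, get_size_price

    # Fetch the latest pricing.json and cache it locally (by default
    # under ~/.libcloud/pricing.json).
    download_pricing_file()

    # Look up the hourly price for a specific size; any compute driver
    # works the same way. Driver name and size ID are illustrative.
    price = get_size_price(driver_type="compute",
                           driver_name="ec2_us_east",
                           size_id="m3.large")
    print(price)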

IIRC, the plan was to include an auto-generated pricing file directly
inside the git repo, but that is more complicated, and I would need to
check with the ASF infra team whether they even allow something like that
(updating and committing changes as a bot user from our CI - Travis CI).

So for now, I will probably just publish this auto-generated pricing.json
file to a public read-only S3 bucket (I will make sure to set up
appropriate rate limits and alerts to prevent abuse, even though the
pricing file itself is quite small).
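Concretely, the CI publish step could be roughly something like this (a
sketch using boto3; the bucket name and file path are hypothetical, as
none of that is decided yet):

    import boto3

    # Hypothetical bucket name - the real one is not decided yet.
    BUCKET = "libcloud-pricing-data"

    s3 = boto3.client("s3")

    # Upload the freshly generated file and mark it world-readable.
    s3.upload_file(
        Filename="libcloud/data/pricing.json",
        Bucket=BUCKET,
        Key="pricing.json",
        ExtraArgs={"ACL": "public-read", "ContentType": "application/json"},
    )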

What do other people think?

Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Tomaz Muraus <to...@apache.org>.
Those files have now been made public.

I will publish a blog post with some details on that in the near future.

Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Tomaz Muraus <to...@apache.org>.
I added some information on this new behavior here -
https://github.com/apache/libcloud/blob/f122600d2adf181a9b100cdd552cd02979c5b1b9/docs/compute/pricing.rst#downloading-latest-pricing-data-from-an-s3-bucket

Keep in mind that those 3 files are not public yet. I plan to make them
public and read-only in the near future once the rate limits are sorted out.
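Once they are public, pointing the library at the bucket should look
roughly like this (a sketch; treat the URL as a placeholder and check
the docs linked above for the real one):

    from libcloud.pricing import download_pricing_file

    # Placeholder URL - see the pricing docs for the actual location.
    PRICING_URL = "https://libcloud-pricing-data.s3.amazonaws.com/pricing.json"

    # Download the S3-hosted file to the default local cache location.
    download_pricing_file(file_url=PRICING_URL)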

Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Tomaz Muraus <to...@apache.org>.
Yeah, I would actually prefer a git repository so everything is version
controlled, etc., but I went with the fastest and simplest approach
possible.

I'm not exactly sure what the ASF rules are for something like that (I
would need to ask the ASF infra team to create a new repo, create a bot
account which we could use in our CI, etc.), and that would likely take
much longer than the approach I went with.

As far as libraries such as pytz (and to some extent also certifi) go, I
would say the situation is slightly different there: time zones tend to
change much less frequently than provider pricing, so publishing a new
library package every now and then is probably sufficient.

Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Samuel Marks <sa...@gmail.com>.
The other solution is to create a new git repository just for frequently
updated files like this one… I mean, we don't want to end up like pytz,
do we?

PS: A good thing about pytz is that other languages literally just parse
pytz's list for their own timezone implementations. No Python. Easy! With
this being in JSON, I could imagine using Terraform libraries in Go,
instead of Libcloud, to do multicloud, and using this costing system to
decide where and when.
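
To illustrate how little is needed: even a stdlib-only Python sketch is
just an HTTP GET plus a JSON parse, and the equivalent in Go or any other
language is about as short (the URL is a placeholder until the bucket
exists, and the keys are illustrative):

    import json
    from urllib.request import urlopen

    # Placeholder URL - use whatever location the project publishes.
    PRICING_URL = "https://libcloud-pricing-data.s3.amazonaws.com/pricing.json"

    with urlopen(PRICING_URL) as response:
        pricing = json.load(response)

    # Illustrative keys - check the generated file for the actual layout.
    print(pricing["compute"]["ec2_us_east"]["m3.large"])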

Samuel Marks
Charity <https://sydneyscientific.org> | consultancy <https://offscale.io>
| open-source <https://github.com/offscale> | LinkedIn
<https://linkedin.com/in/samuelmarks>


Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Jay Rolette <ro...@infinite.io>.
Same here!

Thanks,
Jay

Re: [dev] Publishing up to date pricing data (pricing.json file) to a well known location

Posted by Francisco Ros <fj...@doalitic.com>.
Hey Tomaz,

I'd really love to see this :-)

Thanks,
Francisco
