Posted to dev@openwhisk.apache.org by ap...@cbickel.de on 2017/04/10 12:55:54 UTC

Save activations in a new activations DB

Hi OpenWhisk developers,

Some weeks ago, I started working on database performance, and one item affects the whisks database.
All actions, rules, triggers, packages and activations are stored in this database. Because of all the activations, this database becomes really huge over time.
To keep the database that holds the user-created artifacts (actions, ...) small, we had the idea to put all activations into another database: the activations-db.

To solve the problem of migrating existing OpenWhisk deployments, my proposal is to make the split configurable with Ansible.
PR 2123 (https://github.com/openwhisk/openwhisk/pull/2123) creates this flag. By default, the whisks database will still be used to store activations.
After some time, I'll open another PR that changes the default to use the activations-db.
Again after some time, I'll open a third PR to remove the flag. From then on, all activations will always go into the activations-db.
The reason for removing the flag is that it will be easier to maintain only one configuration: the one that uses two separate dbs.
And we think that all owners of OpenWhisk deployments will agree that it is good to split the user's configuration data (actions, ...) from logs (activations, ...).

If you need to migrate your deployment, you can follow the steps in this document:
https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53

Does anyone have any concerns about this change or any comments?

Greetings Christian

Re: Save activations in a new activations DB

Posted by ap...@cbickel.de.
Hi OpenWhisk team,

As I wrote in my previous mails, I'd like to remove the possibility to save activations in the same database as the actions.
This will be done with the following PR: https://github.com/openwhisk/openwhisk/pull/2196

I suggest merging this PR by the end of the week.

If you have not yet set the variable "db_split_actions_and_activations" to true, you have to migrate your database before the flag is removed.
If you have a development environment where you can lose all your data, you can use the wipe.yml Ansible playbook. If you want to keep all your data and just split the databases, you can use the following migration instructions: https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53

If you need more time or have any concerns, just reply to this mail. Otherwise the PR will be merged by the end of the week.

Greetings Christian


April 27, 2017 1:31 PM, apache@cbickel.de wrote:

> Hi OpenWhisk developers,
> 
> there is again some news on separating the activations into a new database.
> The next pull request (https://github.com/openwhisk/openwhisk/pull/2134) will be merged soon. After
> this pull request is merged, the flag (db_split_actions_and_activations) to enable or disable the
> separation of the activations is required to deploy OpenWhisk.
> It has to be set in your environment's group_vars/all file.
> Additionally, this PR will change the default of the 3 environments (local, mac, distributed) to
> use the additional activations database for all activations.
> If you want to have new activations still in your whisks database, just set this flag to "false" in
> your environment. But keep in mind that the ability to use the whisks database for activations
> will go away soon.
> 
> If you want to keep all activations and want to keep them accessible for the users of your
> deployment, you have to migrate the activations into the new activations database. The steps to do
> this are described in: https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53
> If it is okay for your environment to lose all data stored in the database, the "wipe.yml" playbook
> will regenerate the databases with the right structure.
> 
> My next PR will remove the ability to use the whisks database for activations.
> 
> Greetings
> Christian
> 
> April 16, 2017 1:17 AM, "Rodric Rabbah" <ro...@gmail.com> wrote:
> 
>> I've seen a proposal and initial prototype from Jeremias which tees logs to an external drain. The
>> idea is to make this pluggable in the way you might be thinking, but I'll let him share the details
>> and thoughts and you can evaluate and we can iterate.
>> 
>> I don't think we want to give up the concurrent draining of logs to Couch where they are available
>> nearly instantaneously as part of the core system. This is superior to other systems including
>> Lambda for example (a common complaint is that logs take a long time to be available). I view this
>> as essential to a fast debug cycle while developing.
>> 
>> That said, it is clear that we need to drain logs to a system that will allow more general
>> indexing, searching, measuring, and analyzing of logs - and in the past two weeks this has come up
>> repeatedly in our public Slack channel. This in my view should be done as a pluggable external layer
>> rather than part of the core API. This isn't to say the core API can't provide some more views that
>> can be generally useful.
>> 
>> -r
> 
> On Apr 14, 2017, at 7:55 PM, Michael Marth <mm...@adobe.com> wrote:
> 
> Hi Christian,
> 
> Sorry to chime in late - I was out.
> Recently, I had also been thinking about splitting more static configuration data (like the actions)
> from highly transactional data like the activations. My reason was, however, to have an easier way
> forward to multi-datacenter deployments.
> 
> Regardless of the motivation for the split into activation-db and "static-db" I would like to make
> this comment:
> Now that we split out the activation-db into its own separate API, we should take the opportunity
> to design this API (interface) so that it truly allows pluggable implementations. I am particularly
> interested in an Elasticsearch impl for the activation-db (as I believe that ES lends itself well
> to activations-workloads). There might be other interesting impls. Point being: let's iterate a bit
> over the interface to make sure it can be implemented in non-CouchDB deployments. I am convinced
> this will be helpful in the future in order to evolve OW.
> In light of the above it would be useful to design the activation-db interface with a mindset
> that does not assume that activation-db and static-db are the same physical storage or even the
> same storage technology. Mentioning this to avoid assumptions about ID-semantics, joins, etc.
> 
> Wdyt?
> Michael
> 
> Sent from a mobile device
> _____________________________
> From: apache@cbickel.de
> Sent: Wednesday, April 12, 2017 1:37 PM
> Subject: Re: Save activations in a new activations DB
> To: <de...@openwhisk.apache.org>
> 
> Hi OpenWhisk developers,
> 
> the first pull request I mentioned in the last mail
> (https://github.com/openwhisk/openwhisk/pull/2123) is merged now.
> If you use one of the three environments in open, the default will still be that activations are
> saved in the whisks-db. But now you can set that flag to true.
> The next pull request is open now: https://github.com/openwhisk/openwhisk/pull/2134
> It sets the default of the three environments to use the activations-db. It also requires this
> variable in your environment. It can still be set to true or false.
> 
> If you have existing OpenWhisk deployments, I'd suggest setting the variable to false, doing your
> migration, and setting it to true afterwards.
> The migration is described here: https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53
> 
> My next, and last, PR will be to remove the ability to use that flag.
> 
> If you have any problems with your migration or concerns just reach out to me.
> 
> Greetings Christian
> 
> April 10, 2017 2:56 PM, apache@cbickel.de wrote:
>> Hi OpenWhisk developers,
>> 
>> Some weeks ago, I started working on database performance, and one item affects the
>> whisks database.
>> All actions, rules, triggers, packages and activations are stored in this database. Because of all
>> the activations, this database becomes really huge over time.
>> To keep the database that holds the user-created artifacts (actions, ...) small, we had the idea
>> to put all activations into another database: the activations-db.
>> 
>> To solve the problem of migrating existing OpenWhisk deployments, my proposal is to make the
>> split configurable with Ansible.
>> PR 2123 (https://github.com/openwhisk/openwhisk/pull/2123) creates this flag. By default, the
>> whisks database will still be used to store activations.
>> After some time, I'll open another PR that changes the default to use the activations-db.
>> Again after some time, I'll open a third PR to remove the flag. From then on, all activations
>> will always go into the activations-db.
>> The reason for removing the flag is that it will be easier to maintain only one
>> configuration: the one that uses two separate dbs.
>> And we think that all owners of OpenWhisk deployments will agree that it is good to split
>> the user's configuration data (actions, ...) from logs (activations, ...).
>> 
>> If you need to migrate your deployment, you can follow the steps in this document:
>> https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53
>> 
>> Does anyone have any concerns about this change or any comments?
>> 
>> Greetings Christian

Re: Save activations in a new activations DB

Posted by ap...@cbickel.de.
Hi OpenWhisk developers,

There is again some news on separating the activations into a new database.
The next pull request (https://github.com/openwhisk/openwhisk/pull/2134) will be merged soon. After
this pull request is merged, the flag (db_split_actions_and_activations) to enable or disable the
separation of the activations is required to deploy OpenWhisk.
It has to be set in your environment's group_vars/all file.
Additionally, this PR will change the default of the 3 environments (local, mac, distributed) to
use the additional activations database for all activations.
If you want to have new activations still in your whisks database, just set this flag to "false" in
your environment. But keep in mind that the ability to use the whisks database for activations
will go away soon.
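
The flag semantics described above can be illustrated with a minimal Python sketch. This is not the actual Ansible logic: the helper name activations_target_db, the dict standing in for group_vars/all, and the database name strings are assumptions made for illustration. Only the flag name db_split_actions_and_activations comes from this thread.

```python
FLAG = "db_split_actions_and_activations"  # flag name taken from this thread


def activations_target_db(group_vars: dict) -> str:
    """Return the (illustrative) database that receives new activations.

    Hypothetical helper: a real deployment resolves this in Ansible.
    """
    if FLAG not in group_vars:
        # Once the flag is required, a missing flag is treated as an error.
        raise KeyError(f"{FLAG} must be set in group_vars/all")
    value = group_vars[FLAG]
    if not isinstance(value, bool):
        raise TypeError(f"{FLAG} must be true or false, got {value!r}")
    # True: separate activations database; False: legacy whisks database.
    # Both database names are placeholders, not confirmed by the thread.
    return "activations" if value else "whisks"


if __name__ == "__main__":
    print(activations_target_db({FLAG: True}))
    print(activations_target_db({FLAG: False}))
```

Treating a missing flag as an error rather than silently falling back mirrors the "required to deploy" wording above; a deployment that prefers a default could relax that check.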

If you want to keep all activations and want to keep them accessible for the users of your
deployment, you have to migrate the activations into the new activations database. The steps to do
this are described in: https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53
If it is okay for your environment to lose all data stored in the database, the "wipe.yml" playbook
will regenerate the databases with the right structure.

My next PR will remove the ability to use the whisks database for activations.

Greetings
Christian

Re: Save activations in a new activations DB

Posted by Rodric Rabbah <ro...@gmail.com>.
I've seen a proposal and initial prototype from Jeremias which tees logs to an external drain. The idea is to make this pluggable in the way you might be thinking, but I'll let him share the details and thoughts and you can evaluate and we can iterate.

I don't think we want to give up the concurrent draining of logs to Couch where they are available nearly instantaneously as part of the core system. This is superior to other systems including Lambda for example (a common complaint is that logs take a long time to be available). I view this as essential to a fast debug cycle while developing.

That said, it is clear that we need to drain logs to a system that will allow more general indexing, searching, measuring, and analyzing of logs - and in the past two weeks this has come up repeatedly in our public Slack channel. This in my view should be done as a pluggable external layer rather than part of the core API. This isn't to say the core API can't provide some more views that can be generally useful.
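
The idea of teeing logs to an external drain while keeping the core store fast can be sketched roughly as follows. This is a minimal Python illustration and not the prototype mentioned above: the class TeeLogWriter, the LogSink callable type, and the plain dict standing in for the core store are all hypothetical.

```python
from typing import Callable, Dict, List

# A sink receives an activation id and its log lines (hypothetical signature).
LogSink = Callable[[str, List[str]], None]


class TeeLogWriter:
    """Write logs to the primary store first, then to optional external drains."""

    def __init__(self, primary: LogSink, drains: List[LogSink]):
        self.primary = primary
        self.drains = drains
        self.drain_errors: List[Exception] = []

    def write(self, activation_id: str, lines: List[str]) -> None:
        # The primary write comes first so logs are available right away.
        self.primary(activation_id, lines)
        for drain in self.drains:
            try:
                # Each drain gets its own copy of the lines.
                drain(activation_id, list(lines))
            except Exception as exc:
                # A failing external drain must not break the core path.
                self.drain_errors.append(exc)


if __name__ == "__main__":
    core_store: Dict[str, List[str]] = {}
    external: Dict[str, List[str]] = {}
    writer = TeeLogWriter(core_store.__setitem__, [external.__setitem__])
    writer.write("abc123", ["hello", "world"])
    print(core_store, external)
```

The drains here run inline for brevity; a real layer would more likely hand them off to a queue or background worker so that a slow external system does not delay the primary write.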

-r

Re: Save activations in a new activations DB

Posted by Michael Marth <mm...@adobe.com>.
Hi Christian,

Sorry to chime in late - I was out.
Recently, I had also been thinking about splitting more static configuration data (like the actions) from highly transactional data like the activations. My reason was, however, to have an easier way forward to multi-datacenter deployments.

Regardless of the motivation for the split into activation-db and "static-db" I would like to make this comment:
Now that we split out the activation-db into its own separate API, we should take the opportunity to design this API (interface) so that it truly allows pluggable implementations. I am particularly interested in an Elasticsearch impl for the activation-db (as I believe that ES lends itself well to activations-workloads). There might be other interesting impls. Point being: let's iterate a bit over the interface to make sure it can be implemented in non-CouchDB deployments. I am convinced this will be helpful in the future in order to evolve OW.
In light of the above it would be useful to design the activation-db interface with a mindset that does not assume that activation-db and static-db are the same physical storage or even the same storage technology. Mentioning this to avoid assumptions about ID-semantics, joins, etc.
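
As a rough illustration of such a storage-agnostic interface, here is a minimal Python sketch. It is not OpenWhisk's actual API: the names Activation, ActivationStore, and InMemoryActivationStore, as well as the chosen fields and methods, are assumptions. The point it tries to show is that IDs are opaque strings and that the action is referenced by name only, so nothing relies on joins with the static-db.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Activation:
    activation_id: str  # opaque string, no storage-specific semantics
    namespace: str
    action_name: str    # a plain name, not a foreign key into the static-db
    start_ms: int
    logs: Tuple[str, ...] = ()


class ActivationStore(ABC):
    """Hypothetical interface that any backend could implement."""

    @abstractmethod
    def put(self, activation: Activation) -> None: ...

    @abstractmethod
    def get(self, namespace: str, activation_id: str) -> Optional[Activation]: ...

    @abstractmethod
    def list_recent(self, namespace: str, limit: int) -> List[Activation]: ...


class InMemoryActivationStore(ActivationStore):
    """Toy backend; a CouchDB or Elasticsearch backend would fit the same shape."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[str, str], Activation] = {}

    def put(self, activation: Activation) -> None:
        self._items[(activation.namespace, activation.activation_id)] = activation

    def get(self, namespace: str, activation_id: str) -> Optional[Activation]:
        return self._items.get((namespace, activation_id))

    def list_recent(self, namespace: str, limit: int) -> List[Activation]:
        matches = [a for a in self._items.values() if a.namespace == namespace]
        matches.sort(key=lambda a: a.start_ms, reverse=True)  # newest first
        return matches[:limit]


if __name__ == "__main__":
    store = InMemoryActivationStore()
    store.put(Activation("a1", "guest", "hello", start_ms=100))
    store.put(Activation("a2", "guest", "hello", start_ms=200))
    print([a.activation_id for a in store.list_recent("guest", 10)])
```

Keeping the record self-contained means a backend never has to consult the static-db to answer a query, which is what would let the two stores live on different technologies.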

Wdyt?
Michael

Re: Save activations in a new activations DB

Posted by ap...@cbickel.de.
Hi OpenWhisk developers,

The first pull request I mentioned in the last mail (https://github.com/openwhisk/openwhisk/pull/2123) is merged now.
If you use one of the three environments in open, the default will still be that activations are saved in the whisks-db. But now you can set that flag to true.
The next pull request is open now: https://github.com/openwhisk/openwhisk/pull/2134
It sets the default of the three environments to use the activations-db. It also requires this variable in your environment. It can still be set to true or false.

If you have existing OpenWhisk deployments, I'd suggest setting the variable to false, doing your migration, and setting it to true afterwards.
The migration is described here: https://gist.github.com/cbickel/37e651965781b27de245eac0ce831a53

My next, and last, PR will be to remove the ability to use that flag.

If you have any problems with your migration or concerns just reach out to me.

Greetings Christian
 
