You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/25 08:51:56 UTC

[GitHub] [airflow] hankehly opened a new issue, #25952: Add RDS operators/sensors

hankehly opened a new issue, #25952:
URL: https://github.com/apache/airflow/issues/25952

   ### Description
   
   I think adding the following operators/sensors would benefit companies that need to start/stop RDS instances programmatically.
   
   Name | Description
   :- | :-
   `RdsStartDbOperator` | Start an instance, and optionally wait for it enter "available" state
   `RdsStopDbOperator` | Start an instance, and optionally wait for it to enter "stopped" state
   `RdsStatusSensor` | Wait for the requested status (eg. available, stopped)
   
   Is this something that would be accepted into the codebase?
   Please let me know.
   
   ### Use case/motivation
   
   #### 1. Saving money
   
   RDS is expensive. To save money, a company keeps test/dev environment relational databases shutdown until it needs to use them. With Airflow, they can start a database instance before running a workload, then turn it off after the workload finishes (or errors).
   
   #### 2. Force RDS to stay shutdown
   
   RDS automatically starts a database after 1 week of downtime. A company does not need this feature. They can create a DAG to continuously run the shutdown command on a list of databases instance ids stored in a `Variable`. The alternative is to create a shell script or login to the console and manually shutdown each database every week.
   
   #### 3. Making sure a database is running before scheduling workload
   
   A company programmatically starts/stops its RDS instances. Before they run a workload, they want to make sure it's running. They can use a sensor to make sure a database is available before attempting to run any jobs that require access.
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hankehly commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
hankehly commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1275461650

   @o-nikolas Yes, I apologize for the delay. I have an implementation ready, just need to add tests. I'll have a PR open this weekend.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1275463032

   > @o-nikolas Yes, I apologize for the delay. I have an implementation ready, just need to add tests. I'll have a PR open this weekend.
   
   No worries at all! Thanks for sticking with it :smile: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hankehly commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
hankehly commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1236443373

   @kazanzhy Yes that sounds good to me. In that case, how about we rename the `RdsInstanceSensor` to `RdsDbSensor`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vincbeck commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
vincbeck commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1238646029

   I am not strongly against this solution but I do not see any real value compare to what we what today. I understand the intent of de duplicating the if statement in `_describe_item` in both operator and sensor but on the other side you also add complexity by adding these new "resources". In my opinion, that would make things even more complicated to anyone who want to work on these operators/sensors.
   
   That's only my humble opinion :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hankehly commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
hankehly commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1237585545

   cc @dstandish @o-nikolas @ferruzzi @kazanzhy 
   
   I'd like someone's opinion please.
   
   While editing at the operator/sensor code, I'm noticing a couple things:
   * some duplication in [`RdsBaseSensor`](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/sensors/rds.py) / [`RdsBaseOperator`](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/operators/rds.py) around the `_describe_item` method
   * base classes are responsible for details of subclasses
   
   The current implementation is a great start, but I think we could improve the design to reduce duplication. How about creating (private) reusable models of each RDS resource (ie db instance, cluster, snapshot, etc..) like in the following example?
   
   ```py
   # 1. Create a shared model class for RdsInstance
   class RdsInstance:
       def __init__(self, hook, id):
           self.hook = hook
           self.id = id
   
       def check_status(self, target_statuses: List[str]) -> bool:
           instances = self.hook.conn.describe_db_instances(DBInstanceIdentifier=self.id)
           return instances[0]["DbInstanceStatus"] in target_statuses
   
       def start(self):
           pass # todo
   
       def stop(self):
           pass # todo
   
   
   # 2. Create a shared model class for RdsCluster
   class RdsCluster:
       def __init__(self, hook, id):
           self.hook = hook
           self.id = id
   
       def check_status(self, target_statuses: List[str]) -> bool:
           instances = self.hook.conn.describe_db_clusters(DBClusterIdentifier=self.id)
           return instances[0]["Status"] in target_statuses
   
   
   # 3. Use the models in sensor classes (reduce code duplication)
   class RdsDbSensor(RdsBaseSensor):
       def _poke(self):
           return self._resource.check_status(self.target_statuses)
   
       @cached_property
       def _resource(self):
           if self.db_type == "instance":
               return RdbInstance(self.hook, self.db_identifier)
           return RdbCluster(self.hook, self.db_identifier)
   
   
   # 4. Use the models in sensor classes (reduce code duplication)
   class RdsStartDbOperator(RdsBaseOperator):
       def _execute(self):
           self._resource.start()
   
       @cached_property
       def _resource(self):
           if self.db_type == "instance":
               return RdbInstance(self.hook, self.db_identifier)
           return RdbCluster(self.hook, self.db_identifier)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vincbeck commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
vincbeck commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1227377970

   These operators do not exist and indeed, would be beneficial for the users. Feel free to add them in `airflow/providers/amazon/aws/operators/rds.py`. My only guidance would be to follow other RDS operators and to include documentation, unit tests and add them in the system test. I look forward to reviewing that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kazanzhy commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
kazanzhy commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1236418195

   It will be great to keep consistency with these PRs:
   https://github.com/apache/airflow/pull/21231
   https://github.com/apache/airflow/pull/20907
   
   I mean the `db_type` in `RdsStartDbOperator`, which can be either `instance'` or `'cluster'`. 
   WDYT, @hankehly?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1227194257

   Sure - go ahead - just make sure there are no similar operators there already,. And @o-nikolas @ferruzzi @vincbeck might provide their guidance too :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal closed issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #25952: Add RDS operators/sensors
URL: https://github.com/apache/airflow/issues/25952


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1275460452

   Hey @hankehly,
   
   I see that the sensor PR was merged. Are you still working on the Start/Stop operators?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hankehly commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
hankehly commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1240031197

   Thanks @vincbeck for the feedback. I'll stick to the way it's implemented.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1227753625

   +1 from me as well!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] hankehly commented on issue #25952: Add RDS operators/sensors

Posted by GitBox <gi...@apache.org>.
hankehly commented on issue #25952:
URL: https://github.com/apache/airflow/issues/25952#issuecomment-1229118531

   Great, thanks for the feedback. I'll submit some PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org