Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/15 09:07:55 UTC

[GitHub] [airflow] tandelDipak opened a new issue #13691: Update Connection Hook after starting EC2 instance

tandelDipak opened a new issue #13691:
URL: https://github.com/apache/airflow/issues/13691


   **Description**
   
   Airflow 2 supports starting and stopping an EC2 instance, but the instance's hostname is assigned dynamically each time it starts, so we cannot use the existing connection hook to SSH into it. We have to update the host parameter in the connection hook by some other means (accessing the connection model and updating the host value). Can we add a feature to the AWS EC2 hook that updates the connection hook when we start the instance? This would make the EC2 operators more useful.
   
   **Use case / motivation**
   
   **Are you willing to submit a PR?**
   
   Yes, I am ready to submit a PR for it.
   
   
   **Related Issues**
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal edited a comment on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-848603746


   OK, closing for now since this feature is not clear enough to us, so no one else can pick it up.
   When you do find the time, just open a draft PR with the suggestion you had in mind.



[GitHub] [airflow] eladkal commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-845431056


   @tandelDipak are you opening a PR?



[GitHub] [airflow] eladkal commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-768185540


   I think it would be easier if you could open a PR with your code suggestion so we can review the approach you had in mind.



[GitHub] [airflow] eladkal commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-760926894


   Can you explain your suggestion a bit further?
   
   I'll take my workflow with EMR as an example (EC2 should be similar):
   I create the cluster (the machine ID is pushed to XCom).
   I submit jobs.
   Once the jobs have completed (or failed to complete), I run a termination operator that pulls the ID from XCom and terminates the cluster.
   No machine is left idle.
   
   FYI @feluelle WDYT?
   



[GitHub] [airflow] eladkal edited a comment on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-760926894


   Can you explain your suggestion a bit further?
   
   I'll take my workflow with EMR as an example (EC2 should be similar):
   I create the cluster (the machine ID is pushed to XCom).
   I submit jobs.
   Once the jobs have completed (or failed to complete), I run a termination operator that pulls the ID from XCom and terminates the cluster.
   No machine is left idle.
   
   FYI @feluelle WDYT?
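
The create / submit / terminate flow described in this comment can be sketched in plain Python. This is only an analogy, not Airflow code: `create_cluster`, `submit_job`, and `terminate_cluster` are hypothetical callables, and a local variable stands in for the machine ID that the comment describes passing through XCom. The `try`/`finally` plays the role of a termination step that runs whether the jobs succeed or fail.

```python
from typing import Any, Callable, Iterable, List


def run_on_ephemeral_cluster(
    create_cluster: Callable[[], str],
    submit_job: Callable[[str, Any], Any],
    terminate_cluster: Callable[[str], None],
    jobs: Iterable[Any],
) -> List[Any]:
    """Create a cluster, run every job on it, and always terminate it afterwards."""
    # Stands in for the machine ID that the comment describes pushing to XCom.
    cluster_id = create_cluster()
    try:
        return [submit_job(cluster_id, job) for job in jobs]
    finally:
        # Runs on success and on failure, so no machine is left idle.
        terminate_cluster(cluster_id)
```

In an Airflow DAG the same guarantee would presumably come from how the termination task is triggered rather than from a `finally` block; that wiring is not shown here.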
   



[GitHub] [airflow] feluelle commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
feluelle commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-762048919


   @tandelDipak I also don't quite understand what you are trying to do. Can you elaborate on your use case, please?



[GitHub] [airflow] tandelDipak commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
tandelDipak commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-768127146


   @eladkal Yes, I am suggesting something similar.
   Let's say you have an EC2Hook already defined with the host value **http://ec2-154-509-231-157.us-west-2.compute.amazonaws.com:8000/**, and I am not using a static IP. If I stop my instance and then start it, I get a new DNS hostname, but my EC2Hook still has the old hostname value **http://ec2-154-509-231-157.us-west-2.compute.amazonaws.com:8000/**. So I have to update it every time I start my instance. Once I update the host parameter of the EC2Hook, I can use SSHOperator to run commands on the machine. If I don't update the value, SSHOperator fails because the EC2Hook points to the old hostname, which no longer exists.
   
   This is not an issue if you have a static public IP, but there is a limit on how many static IPs you can buy within a region.
   
   So we could add an option to EC2StartInstanceOperator that changes the host value of the EC2Hook if the user wants. This would make EC2StartInstanceOperator and EC2StopInstanceOperator more usable. Right now I am using a custom function that starts the EC2 instance (almost the same as EC2StartInstanceOperator) and then updates the host value of the EC2Hook if I pass the connection_id parameter.
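
The custom function described here is not shown in the thread, so the following is only a rough sketch of what "start the instance, then update the host" might look like. It assumes `ec2_client` behaves like a boto3 EC2 client and `connection` is any object with a writable `host` attribute (for example a connection record loaded from the metadata database). The function name `start_and_refresh_host` is hypothetical, and persisting the change is left to the caller.

```python
from typing import Any


def start_and_refresh_host(ec2_client: Any, instance_id: str, connection: Any) -> str:
    """Start an instance, wait until it runs, and copy its new hostname to `connection`."""
    ec2_client.start_instances(InstanceIds=[instance_id])
    # Block until EC2 reports the instance as running.
    ec2_client.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    instance = response["Reservations"][0]["Instances"][0]
    host = instance.get("PublicDnsName")
    if not host:
        raise ValueError(f"Instance {instance_id} has no public DNS name")

    # Only the in-memory object is updated; saving it (for example by
    # committing the session that owns it) is up to the caller.
    connection.host = host
    return host
```

Making the connection update opt-in, as the comment proposes, would keep the default behaviour of the existing operators unchanged.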



[GitHub] [airflow] boring-cyborg[bot] commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-760768418


   Thanks for opening your first issue here! Be sure to follow the issue template!
   



[GitHub] [airflow] tandelDipak commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
tandelDipak commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-848526738


   @eladkal I haven't worked on it yet. I'm really sorry.  Will update you if I find time to work on it.



[GitHub] [airflow] eladkal closed issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #13691:
URL: https://github.com/apache/airflow/issues/13691


   



[GitHub] [airflow] eladkal commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-848603746


   OK closing for now.
   When you do find the time just open a draft PR.



[GitHub] [airflow] eladkal commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-768047415


   @tandelDipak So you are using [EC2StartInstanceOperator](https://github.com/apache/airflow/blob/7f4c88c0680b4fb98fe8b31800a93e1d0476c4db/airflow/providers/amazon/aws/operators/ec2_start_instance.py#L27), [EC2StopInstanceOperator](https://github.com/apache/airflow/blob/7f4c88c0680b4fb98fe8b31800a93e1d0476c4db/airflow/providers/amazon/aws/operators/ec2_stop_instance.py#L27) to interact with the machine.
   And you are missing a way to get the machine host?
   
   So your request is to add a method to `EC2Hook` that will return the machine host?
   Something like https://stackoverflow.com/questions/34728477/retrieving-public-dns-of-ec2-instance-with-boto3 
   https://maskaravivek.medium.com/how-to-ssh-into-an-ec2-instance-using-boto3-a138a4345a91
   ?



[GitHub] [airflow] feluelle commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
feluelle commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-771003561


   > So we could add an option to EC2StartInstanceOperator that changes the host value of the EC2Hook if the user wants. This would make EC2StartInstanceOperator and EC2StopInstanceOperator more usable. Right now I am using a custom function that starts the EC2 instance (almost the same as EC2StartInstanceOperator) and then updates the host value of the EC2Hook if I pass the connection_id parameter.
   
   Or return the hostname to XCom? We currently do not return anything, and returning the hostname makes sense IMO. What do you think?
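
A minimal sketch of the "return the hostname" idea, under stated assumptions: `instance` is taken to behave like a boto3 EC2 `Instance` resource, and `StartInstanceReturningHost` is a hypothetical stand-in class, not an existing Airflow operator. In Airflow, the value returned from an operator's `execute` method is pushed to XCom by default, which is what would make the hostname available to a downstream task.

```python
from typing import Any


class StartInstanceReturningHost:
    """Hypothetical operator-like class; not part of Airflow."""

    def __init__(self, instance: Any) -> None:
        # Assumed to behave like a boto3 EC2 Instance resource.
        self.instance = instance

    def execute(self, context: Any = None) -> str:
        self.instance.start()
        self.instance.wait_until_running()
        # Refresh cached attributes so the newly assigned hostname is visible.
        self.instance.reload()
        return self.instance.public_dns_name
```

Compared with rewriting a stored connection, this keeps the start operator free of side effects on the metadata database; the downstream task decides what to do with the hostname.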



[GitHub] [airflow] tandelDipak commented on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
tandelDipak commented on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-762061432


   @feluelle @eladkal Thank you for your efforts. Let me explain the use case in detail.
   Let's say I want to run some task, or a simple ls command, on an EC2 instance. I don't keep my EC2 instance always on; I turn it on when needed. So, to run a bash script or some other command, I first have to start the instance. Then I have to SSH into the EC2 instance, passing the command as an argument to SSHOperator. After the work is done, I shut down the EC2 machine. Now, SSHOperator uses a connection hook, which takes a hostname parameter (to SSH into the system I need the hostname of the EC2 instance). For an AWS EC2 instance, the hostname is assigned dynamically when I start the instance. So the connection hook has to be updated with the new hostname before it is used in SSHOperator. I hope it's a bit clearer now.
   
   Let me know if you need more details. I appreciate your support.
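
To illustrate the "hostname is assigned dynamically" point, here is a small sketch of looking up an instance's current public DNS name after it has started. It assumes `ec2_client` behaves like a boto3 EC2 client; the helper name `get_public_dns` is hypothetical and does not exist in Airflow.

```python
from typing import Any, Optional


def get_public_dns(ec2_client: Any, instance_id: str) -> Optional[str]:
    """Return the instance's current public DNS name, or None if none is found."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance.get("InstanceId") == instance_id:
                # The name may be empty or missing, e.g. for a stopped
                # instance or one without a public address.
                return instance.get("PublicDnsName") or None
    return None
```

The value returned here is what would have to reach the SSH step, whether by updating the stored connection or by passing it along at run time.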



[GitHub] [airflow] eladkal edited a comment on issue #13691: Update Connection Hook after starting EC2 instance

Posted by GitBox <gi...@apache.org>.
eladkal edited a comment on issue #13691:
URL: https://github.com/apache/airflow/issues/13691#issuecomment-768047415


   @tandelDipak So you are using [EC2StartInstanceOperator](https://github.com/apache/airflow/blob/7f4c88c0680b4fb98fe8b31800a93e1d0476c4db/airflow/providers/amazon/aws/operators/ec2_start_instance.py#L27), [EC2StopInstanceOperator](https://github.com/apache/airflow/blob/7f4c88c0680b4fb98fe8b31800a93e1d0476c4db/airflow/providers/amazon/aws/operators/ec2_stop_instance.py#L27) to interact with the machine.
   And you are missing a way to get the machine host?
   
   So your request is to add a method to `EC2Hook` that will return the machine host?
   Something like https://stackoverflow.com/questions/34728477/retrieving-public-dns-of-ec2-instance-with-boto3 
   https://maskaravivek.medium.com/how-to-ssh-into-an-ec2-instance-using-boto3-a138a4345a91
   ?
   
   From what I can tell the `public_ip_address` is part of the Instance so `get_instance` already provides this.

