You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/28 20:07:53 UTC

[GitHub] [airflow] CodingJonas opened a new pull request #8617: Extend docker swarm operator configuration

CodingJonas opened a new pull request #8617:
URL: https://github.com/apache/airflow/pull/8617


   While working with the operator I had to change a few things to work better for my use cases.
   I thought I'd make a pull request, perhaps some changes seem relevant.
   I'm happy to cherry pick code changes that makes sense for the project. Here is a list of the changes:
   - Additional options to define configs, secrets and networks for the new service to use
   - The service name now contains the Airflow `task_id` to easier distinguish between multiple services started through Airflow
   - Short sleeping intervals between service status polls to reduce cpu usage
   - Minor refactoring to improve readability of the `_run_service()` method
   
   ---
   - [X] Description above provides context of the change
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Target Github ISSUE in description if exists
   - [X] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r436339070



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -93,20 +95,37 @@ class DockerSwarmOperator(DockerOperator):
         Supported only if the Docker engine is using json-file or journald logging drivers.
         The `tty` parameter should be set to use this with Python applications.
     :type enable_logging: bool
+    :param configs: List of ConfigReferences that will be exposed to the service
+    :type configs: list
+       Example: [{'source': my-conf, 'target': where/config/should/be/.env}]
+    :param secrets: List of SecretReference to be made available inside the containers
+    :type secrets: list
+    :param networks: List of network names or IDs to attach the service to
+    :type networks: list
     """
 
     @apply_defaults
     def __init__(
             self,
             image,
             enable_logging=True,
+            networks=None,
+            configs=None,
+            secrets=None,
             *args,
             **kwargs):
         super().__init__(image=image, *args, **kwargs)
 
         self.enable_logging = enable_logging
+        self.networks = networks
+        self.configs = configs
+        self.secrets = secrets
+        self.service_name = self._get_service_name()
         self.service = None
 
+    def _get_service_name(self):
+        return '%s__af_%s' % (self.task_id, get_random_string())

Review comment:
       Is there any service_name length limit?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-642002414


   @nullhack I have no dedicated test dag as I didn't use a dedicated test environment (I copied the code changes from a different, existing environment), but this should work:
   
   ### Start docker swarm with network and config
   ```bash
   docker swarm init
   docker network create -d overlay --attachable private-network
   echo "example config content" | docker config create the-config -
   
   # This service does nothing but exist to be pinged at
   docker service create -d --replicas 1 --network=private-network --name server_ping_me          alpine tail -f /dev/null
   ```
   ### Example DAG
   The task in the DAG should be able to successfully ping the other service.
   ```python
   dag_params = dict(
       dag_id='full_docker_swarm_test',
       schedule_interval=None,
       start_date=datetime(2020, 3, 20),
       catchup=False,
   )
   
   with DAG(**dag_params) as dag:
       task = FullDockerSwarmOperator(
           task_id='ping_inside_server',
           image='alpine',
           api_version='auto',
           networks=['private-network'],
           configs=[{'source': 'the-config', 'target': 'the-config'}],
           command='ping -c 4 server_ping_me',
           tty=True,
       )
   
       task
   ```
   I haven't tested the example script in this comment.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-640217113


   @nullhack Can I ask for a review?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nullhack commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
nullhack commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r436528891



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -16,6 +16,8 @@
 # under the License.
 """Run ephemeral Docker Swarm services"""
 
+from time import sleep
+

Review comment:
       As a suggestion, running `isort` on imports is a good practice.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] enima2684 commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
enima2684 commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-656215476


   Many thanks for this addition ! I am looking forward to use this feature :)
   I can help if anything is blocking the merge of this PR, just let me know


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] stale[bot] closed pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
stale[bot] closed pull request #8617:
URL: https://github.com/apache/airflow/pull/8617


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] stale[bot] commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-683255363


   This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-641981402


   @nullhack @mik-laj I updated the PR, split the vague commit into multiple smaller ones, and responded to your comments. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r438081599



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -93,20 +95,37 @@ class DockerSwarmOperator(DockerOperator):
         Supported only if the Docker engine is using json-file or journald logging drivers.
         The `tty` parameter should be set to use this with Python applications.
     :type enable_logging: bool
+    :param configs: List of ConfigReferences that will be exposed to the service
+    :type configs: list
+       Example: [{'source': my-conf, 'target': where/config/should/be/.env}]
+    :param secrets: List of SecretReference to be made available inside the containers
+    :type secrets: list
+    :param networks: List of network names or IDs to attach the service to
+    :type networks: list
     """
 
     @apply_defaults
     def __init__(
             self,
             image,
             enable_logging=True,
+            networks=None,
+            configs=None,
+            secrets=None,
             *args,
             **kwargs):
         super().__init__(image=image, *args, **kwargs)
 
         self.enable_logging = enable_logging
+        self.networks = networks
+        self.configs = configs
+        self.secrets = secrets
+        self.service_name = self._get_service_name()

Review comment:
       Updated this. Since `_get_service_name()` is only called twice, I removed the member variable and call the function every time we need the name.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r436339000



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -93,20 +95,37 @@ class DockerSwarmOperator(DockerOperator):
         Supported only if the Docker engine is using json-file or journald logging drivers.
         The `tty` parameter should be set to use this with Python applications.
     :type enable_logging: bool
+    :param configs: List of ConfigReferences that will be exposed to the service
+    :type configs: list
+       Example: [{'source': my-conf, 'target': where/config/should/be/.env}]
+    :param secrets: List of SecretReference to be made available inside the containers
+    :type secrets: list
+    :param networks: List of network names or IDs to attach the service to
+    :type networks: list
     """
 
     @apply_defaults
     def __init__(
             self,
             image,
             enable_logging=True,
+            networks=None,
+            configs=None,
+            secrets=None,
             *args,
             **kwargs):
         super().__init__(image=image, *args, **kwargs)
 
         self.enable_logging = enable_logging
+        self.networks = networks
+        self.configs = configs
+        self.secrets = secrets
+        self.service_name = self._get_service_name()

Review comment:
       We should not initialize this field in the constructor because the constructor is called each time the DAG is loaded.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas removed a comment on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas removed a comment on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-623339870


   The Pipeline fails on an unrelated test:
   ```
    >       self.assertEqual(State.RUNNING, ti.state)
   E       AssertionError: 'running' != 'success'
   E       - running
   E       + success
   
   tests/jobs/test_local_task_job.py:400: AssertionError
   ```
   In case this is delaying the review, can I just rerun this step of the pipeline?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r437686189



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -93,20 +95,37 @@ class DockerSwarmOperator(DockerOperator):
         Supported only if the Docker engine is using json-file or journald logging drivers.
         The `tty` parameter should be set to use this with Python applications.
     :type enable_logging: bool
+    :param configs: List of ConfigReferences that will be exposed to the service
+    :type configs: list
+       Example: [{'source': my-conf, 'target': where/config/should/be/.env}]
+    :param secrets: List of SecretReference to be made available inside the containers
+    :type secrets: list
+    :param networks: List of network names or IDs to attach the service to
+    :type networks: list
     """
 
     @apply_defaults
     def __init__(
             self,
             image,
             enable_logging=True,
+            networks=None,
+            configs=None,
+            secrets=None,
             *args,
             **kwargs):
         super().__init__(image=image, *args, **kwargs)
 
         self.enable_logging = enable_logging
+        self.networks = networks
+        self.configs = configs
+        self.secrets = secrets
+        self.service_name = self._get_service_name()
         self.service = None
 
+    def _get_service_name(self):
+        return '%s__af_%s' % (self.task_id, get_random_string())

Review comment:
       Good point, I looked into it but didn't find any limitations.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-623339870


   The Pipeline fails on an unrelated test:
   ```
    >       self.assertEqual(State.RUNNING, ti.state)
   E       AssertionError: 'running' != 'success'
   E       - running
   E       + success
   
   tests/jobs/test_local_task_job.py:400: AssertionError
   ```
   In case this is delaying the review, can I just rerun this step of the pipeline?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r437686189



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -93,20 +95,37 @@ class DockerSwarmOperator(DockerOperator):
         Supported only if the Docker engine is using json-file or journald logging drivers.
         The `tty` parameter should be set to use this with Python applications.
     :type enable_logging: bool
+    :param configs: List of ConfigReferences that will be exposed to the service
+    :type configs: list
+       Example: [{'source': my-conf, 'target': where/config/should/be/.env}]
+    :param secrets: List of SecretReference to be made available inside the containers
+    :type secrets: list
+    :param networks: List of network names or IDs to attach the service to
+    :type networks: list
     """
 
     @apply_defaults
     def __init__(
             self,
             image,
             enable_logging=True,
+            networks=None,
+            configs=None,
+            secrets=None,
             *args,
             **kwargs):
         super().__init__(image=image, *args, **kwargs)
 
         self.enable_logging = enable_logging
+        self.networks = networks
+        self.configs = configs
+        self.secrets = secrets
+        self.service_name = self._get_service_name()
         self.service = None
 
+    def _get_service_name(self):
+        return '%s__af_%s' % (self.task_id, get_random_string())

Review comment:
       Good point, I looked into it but didn't find any mentions of length limitations.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nullhack commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
nullhack commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-642074003


   @CodingJonas Thank you, I'll test over the weekend. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-641541320


   Yes, I guess I should have structured this PR more in the first place, I see that I split the commit into better reviewable commits in the next days!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-640177774


   Your changes look interesting, but it's hard to read. Could you divide these changes into several smaller ones? Reviewers will also be more active if they can quickly review these changes and don't have to focus too much.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nullhack commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
nullhack commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-640235542


   > @nullhack Can I ask for a review?
   
   Sure, I'll do soon. 
   
   I like this PR, `configs, secrets and networks` are nice adds.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nullhack edited a comment on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
nullhack edited a comment on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-642074003


   @CodingJonas Thank you, I'll try to test it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas edited a comment on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas edited a comment on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-642002414


   @nullhack I have no dedicated test dag as I didn't use a dedicated test environment (I copied the code changes from a different, existing environment), but this should work:
   
   ### Start docker swarm with network and config
   ```bash
   docker swarm init
   docker network create -d overlay --attachable private-network
   echo "example config content" | docker config create the-config -
   
   # This service does nothing but exist to be pinged at
   docker service create -d --replicas 1 --network=private-network --name server_ping_me          alpine tail -f /dev/null
   ```
   ### Example DAG
   The task in the DAG should be able to successfully ping the other service.
   ```python
   dag_params = dict(
       dag_id='full_docker_swarm_test',
       schedule_interval=None,
       start_date=datetime(2020, 3, 20),
       catchup=False,
   )
   
   with DAG(**dag_params) as dag:
       task = DockerSwarmOperator(
           task_id='ping_inside_server',
           image='alpine',
           api_version='auto',
           networks=['private-network'],
           configs=[{'source': 'the-config', 'target': 'the-config'}],
           command='ping -c 4 server_ping_me',
           tty=True,
       )
   
       task
   ```
   I haven't tested the example script in this comment.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nullhack commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
nullhack commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-657182692


   Hi @CodingJonas sorry for the delay, I've been busy past weekends, I did not manage to fully test It yet. I've made three attempts of running so far, but I'm having issues with my environment (not related to your code). I need to fix my machine before I can proceed.
   
   I'll try to finish by next weekend. But in meantime if someone else can review feel free to do so.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on a change in pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on a change in pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#discussion_r438082229



##########
File path: airflow/providers/docker/operators/docker_swarm.py
##########
@@ -16,6 +16,8 @@
 # under the License.
 """Run ephemeral Docker Swarm services"""
 
+from time import sleep
+

Review comment:
       Thanks for the hint, `isort` seems really nice. In my case it did not change the import structure. Does Airflow use some custom config or rules for it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] CodingJonas commented on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
CodingJonas commented on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-656514822


   @nullhack @mik-laj, do you have time for a review soon? @enima2684, if you find any issues in the code, I'm always happy about feedback!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #8617: Extend docker swarm operator configuration

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #8617:
URL: https://github.com/apache/airflow/pull/8617#issuecomment-640177774


   Your changes look interesting, but it's hard to review. Could you divide these changes into several smaller ones? Reviewers will also be more active if they can quickly review these changes and don't have to focus too much.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org