Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/10/17 14:47:51 UTC
[GitHub] [airflow] jnunezgts opened a new issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
jnunezgts opened a new issue #11617:
URL: https://github.com/apache/airflow/issues/11617
**Apache Airflow version**:
1.10.12, using SQLite as the backend
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
N/A. Using Docker Swarm 19.03.8
**Environment**:
- **Cloud provider or hardware configuration**:
No cloud, bare-metal server:
```
HP ProLiant DL560 Gen8, BIOS P77 12/20/2013, 64 cpus
```
- **OS** (e.g. from /etc/os-release):
```
Fedora release 29 (Twenty Nine)
```
- **Kernel** (e.g. `uname -a`):
```
Linux server.company.com 4.19.82-1300.fc29.x86_64 #1 SMP Fri Nov 8 10:49:58 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
```
- **Install tools**:
```
pip
```
- **Others**:
```
Python 3.7.2 (default, Jan 16 2019, 19:49:22)
[GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
```
Docker info:
```
Client:
Debug Mode: false
Server:
Containers: 21
Running: 0
Paused: 0
Stopped: 21
Images: 12
Server Version: 19.03.8
Storage Driver: overlay2
Backing Filesystem: <unknown>
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
NodeID: j0pl320hoxuqcaaa14z2znvgo
Is Manager: true
ClusterID: kpgz783mpw8aapdxchtwdu2ff
Managers: 1
Nodes: 4
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 172.29.248.55
Manager Addresses:
172.29.248.55:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 4.19.82-1300.fc29.x86_64
Operating System: Fedora 29 (Twenty Nine)
OSType: linux
Architecture: x86_64
CPUs: 64
Total Memory: 125.9GiB
Name: server.company.com
ID: 7ESU:O253:JGNS:YJIY:XXX:CYTI:WFQC:6L5C:XXXX:62IO:VH23:XXXX
Docker Root Dir: /opt/docker
Debug Mode: false
HTTP Proxy: http://proxy.company.com:8080/
HTTPS Proxy: http://proxy.company:8080/
No Proxy: localhost,127.0.0.1,server.company.com,.company.com
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
privatereg.company.com:5000
localhost:5000
server.company.com:5000
127.0.0.0/8
Live Restore Enabled: false
```
**What happened**:
I created the following DAG to schedule a one-shot job:
```
from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.docker_swarm_operator import DockerSwarmOperator

DEFAULT_ARGS = {
    'retry_delay': timedelta(minutes=5),
    'retries': 1,
    'email_on_failure': True,
    'email_on_retry': False,
    'email': ['myemail@company.com']
}

with DAG(
    '24_7_box',
    description='24 x 7. With retries',
    default_args=DEFAULT_ARGS,
    schedule_interval='0 * * * Mon-Sun',
    start_date=datetime(2019, 7, 23),
    max_active_runs=1,
    catchup=False,
) as twenty_four_by_seven_dag:
    # See:
    # https://airflow.apache.org/docs/stable/_api/airflow/contrib/operators/docker_swarm_operator/index.html
    # https://airflow.apache.org/docs/stable/_modules/airflow/contrib/operators/docker_swarm_operator.html
    SLEEP_TASK = DockerSwarmOperator(
        task_id="SLEEP_TASK",
        image="fedora:29",
        api_version="auto",
        command="/bin/sleep 60",
        docker_url="unix://var/run/docker-sysavtbuild.sock",
        force_pull=False,
        mem_limit="500m",
        auto_remove=True,
    )
    SLEEP_TASK
```
**What you expected to happen**:
I was expecting the container to be created, stay alive for 60 seconds, and then exit with code=0. No output.
Others have reported success in the past using [Docker Swarm Operator](https://airflow.apache.org/docs/stable/_api/airflow/contrib/operators/docker_swarm_operator/).
Not sure. The Airflow log shows the following:
```
[2020-10-17 09:46:58,475] {taskinstance.py:1150} ERROR - 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
```
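The message suggests that a string reached the Docker API in a field that the daemon decodes as a 64-bit integer. As a rough illustration only (the dictionary below is a simplified, assumed fragment of a service spec, not the exact request body that docker-py sends), compare how a string and an integer serialize to JSON:

```python
import json


def resources_fragment(mem_limit):
    """Build a simplified, hypothetical fragment of a Swarm service spec."""
    return {"Resources": {"Limits": {"MemoryBytes": mem_limit}}}


# A string value is serialized as a JSON string, which a Go int64 field cannot decode.
as_string = json.dumps(resources_fragment("500m"))

# An integer number of bytes is serialized as a JSON number.
as_int = json.dumps(resources_fragment(500 * 1024 * 1024))

print(as_string)
print(as_int)
```

The first payload carries `"MemoryBytes": "500m"`, which matches the shape of the complaint in the error; the second carries a plain number.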
I can run this command from docker CLI as follows:
```
[user@server dags]$ docker run --rm --detach fedora:29 /bin/sleep 45
29912c34f43e2dfa20d417cb80113059a183518b99215609c0aa7b37874c27db
[user@server dags]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
29912c34f43e fedora:29 "/bin/sleep 45" 7 seconds ago Up 6 seconds gifted_pare
```
**How to reproduce it**:
1. Copy the DAG provided into ~/airflow/dags
2. Turn on (unpause) the DAG
3. Trigger the DAG or let the scheduler run it. The error will show up.
**Anything else we need to know**:
<details><summary>Airflow.log</summary>
*** Reading local file: /home/user/airflow/logs/avt_24_7_box/SLEEP_TASK/2020-10-17T13:24:26.101897+00:00/2.log
[2020-10-17 09:46:58,312] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: avt_24_7_box.SLEEP_TASK 2020-10-17T13:24:26.101897+00:00 [queued]>
[2020-10-17 09:46:58,321] {taskinstance.py:670} INFO - Dependencies all met for <TaskInstance: avt_24_7_box.SLEEP_TASK 2020-10-17T13:24:26.101897+00:00 [queued]>
[2020-10-17 09:46:58,321] {taskinstance.py:880} INFO -
--------------------------------------------------------------------------------
[2020-10-17 09:46:58,321] {taskinstance.py:881} INFO - Starting attempt 2 of 2
[2020-10-17 09:46:58,321] {taskinstance.py:882} INFO -
--------------------------------------------------------------------------------
[2020-10-17 09:46:58,328] {taskinstance.py:901} INFO - Executing <Task(DockerSwarmOperator): SLEEP_TASK> on 2020-10-17T13:24:26.101897+00:00
[2020-10-17 09:46:58,335] {standard_task_runner.py:54} INFO - Started process 35637 to run task
[2020-10-17 09:46:58,371] {standard_task_runner.py:77} INFO - Running: ['airflow', 'run', '24_7_box', 'SLEEP_TASK', '2020-10-17T13:24:26.101897+00:00', '--job_id', '55', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/avt_test3.py', '--cfg_path', '/tmp/tmpaivrdhuu']
[2020-10-17 09:46:58,372] {standard_task_runner.py:78} INFO - Job 55: Subtask SLEEP_TASK
[2020-10-17 09:46:58,398] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: avt_24_7_box.SLEEP_TASK 2020-10-17T13:24:26.101897+00:00 [running]> server.company.com
[2020-10-17 09:46:58,467] {docker_swarm_operator.py:105} INFO - Starting docker service from image fedora:29
[2020-10-17 09:46:58,475] {taskinstance.py:1150} ERROR - 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
Traceback (most recent call last):
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 259, in _raise_for_status
response.raise_for_status()
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/requests/models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.40/services/create
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/models/taskinstance.py", line 984, in _run_raw_task
result = task_copy.execute(context=context)
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/operators/docker_operator.py", line 277, in execute
return self._run_image()
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/airflow/contrib/operators/docker_swarm_operator.py", line 119, in _run_image
labels={'name': 'airflow__%s__%s' % (self.dag_id, self.task_id)}
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/utils/decorators.py", line 34, in wrapper
return f(self, *args, **kwargs)
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/service.py", line 190, in create_service
self._post_json(url, data=data, headers=headers), True
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 265, in _result
self._raise_for_status(response)
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/api/client.py", line 261, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/user/virtualenv/airflow/lib64/python3.7/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error: Bad Request ("json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64")
[2020-10-17 09:46:58,481] {taskinstance.py:1194} INFO - Marking task as FAILED. dag_id=24_7_box, task_id=SLEEP_TASK, execution_date=20201017T132426, start_date=20201017T134658, end_date=20201017T134658
[2020-10-17 09:46:58,509] {configuration.py:373} WARNING - section/key [smtp/smtp_user] not found in config
[2020-10-17 09:46:58,583] {email.py:132} INFO - Sent an alert email to ['user@company.com']
[2020-10-17 09:47:03,312] {local_task_job.py:102} INFO - Task exited with return code 1
</details>
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #11617:
URL: https://github.com/apache/airflow/issues/11617#issuecomment-711023327
Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] mik-laj commented on issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11617:
URL: https://github.com/apache/airflow/issues/11617#issuecomment-712134943
@jnunezgts Can I assign you to this report? I am happy to help with the review.
CC: @nullhack
[GitHub] [airflow] mik-laj commented on issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
Posted by GitBox <gi...@apache.org>.
mik-laj commented on issue #11617:
URL: https://github.com/apache/airflow/issues/11617#issuecomment-712149329
Don't worry. Be happy. We will definitely help you. We also have a special channel for new contributors - #airflow-how-to-pr ([![Slack Status](https://img.shields.io/badge/slack-join_chat-white.svg?logo=slack&style=social)](https://s.apache.org/airflow-slack))
[GitHub] [airflow] jnunezgts commented on issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
Posted by GitBox <gi...@apache.org>.
jnunezgts commented on issue #11617:
URL: https://github.com/apache/airflow/issues/11617#issuecomment-711142545
Hello,
I think I found the bug. If I completely remove the memory limit parameter (`mem_limit="500m"`) from the DAG, the task runs without a problem. The documentation says you can pass either a float value in bytes or a string that gets parsed correctly (as with the Docker operator), but that is not the case here:
```
:param mem_limit: Maximum amount of memory the container can use.
Either a float value, which represents the limit in bytes,
or a string like ``128m`` or ``1g``.
:type mem_limit: float or str
```
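A possible stopgap, sketched here under assumptions, is to convert the string to an integer number of bytes before handing it to the operator. The helper name `mem_limit_to_bytes` is hypothetical, the 1024-based units follow the usual Docker convention, and the thread only confirms that omitting `mem_limit` avoids the error; whether the operator accepts an integer is an untested assumption.

```python
import re

# 1024-based multipliers; an empty suffix means plain bytes.
_UNIT_FACTORS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def mem_limit_to_bytes(value):
    """Convert a limit such as "128m", "1g" or 1024.0 to an int byte count.

    Hypothetical helper for illustration; not part of Airflow or docker-py.
    """
    if isinstance(value, (int, float)):
        return int(value)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([kmg]?)b?", value.strip().lower())
    if match is None:
        raise ValueError("Unrecognized memory limit: %r" % (value,))
    number, unit = match.groups()
    return int(float(number) * _UNIT_FACTORS[unit])


print(mem_limit_to_bytes("500m"))  # 524288000
```

In the DAG this would be used as `mem_limit=mem_limit_to_bytes("500m")`. docker-py also appears to expose a similar `parse_bytes` helper in `docker.utils`, which may be preferable if it is available in the installed version.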
[GitHub] [airflow] jnunezgts commented on issue #11617: Docker Swarm operator is broken: json: cannot unmarshal string into Go struct field Resources.MemoryBytes of type int64
Posted by GitBox <gi...@apache.org>.
jnunezgts commented on issue #11617:
URL: https://github.com/apache/airflow/issues/11617#issuecomment-712140669
> @jnunezgts Can I assign you to this report? I am happy to help with the review.
> CC: @nullhack
Hello,
That works. You may need to do a little bit of hand-holding, as I haven't submitted a patch to an open-source project in a while. I also need to debug this to see why the DockerOperator is happy with the flag but the DockerSwarmOperator is not (I suspect it is how they pass the values between them, but that is a wild guess at this point).