You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/13 19:41:18 UTC

[GitHub] [airflow] dstandish opened a new issue #13046: installation of simplejson breaks airlfow webserver 2.0.0RC2

dstandish opened a new issue #13046:
URL: https://github.com/apache/airflow/issues/13046


   To reproduce:
   1. `pip install apache-airflow==2.0.0RC2`
   2. `pip install simplejson`
   3. run webserver
   4. open in browser and observe following error
   
   Error:
   ```
   [2020-12-13 11:37:28 -0800] [85061] [INFO] Starting gunicorn 19.10.0
   [2020-12-13 11:37:28 -0800] [85061] [INFO] Listening at: http://0.0.0.0:8080 (85061)
   [2020-12-13 11:37:28 -0800] [85061] [INFO] Using worker: sync
   [2020-12-13 11:37:28 -0800] [85064] [INFO] Booting worker with pid: 85064
   [2020-12-13 11:37:36,444] {app.py:1892} ERROR - Exception on /home [GET]
   Traceback (most recent call last):
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
       ctx.push()
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/ctx.py", line 390, in push
       self.session = session_interface.open_session(self.app, self.request)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/sessions.py", line 340, in open_session
       s = self.get_signing_serializer(app)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/sessions.py", line 336, in get_signing_serializer
       signer_kwargs=signer_kwargs,
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/itsdangerous/serializer.py", line 95, in __init__
       self.is_text_serializer = is_text_serializer(serializer)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/itsdangerous/serializer.py", line 13, in is_text_serializer
       return isinstance(serializer.dumps({}), text_type)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/tag.py", line 305, in dumps
       return dumps(self.tag(value), separators=(",", ":"))
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/__init__.py", line 211, in dumps
       rv = _json.dumps(obj, **kwargs)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/simplejson/__init__.py", line 412, in dumps
       **kw).encode(obj)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/airflow/utils/json.py", line 36, in __init__
       super().__init__(*args, **kwargs)
   TypeError: __init__() got an unexpected keyword argument 'encoding'
   [2020-12-13 11:37:36 -0800] [85064] [ERROR] Error handling request /home
   Traceback (most recent call last):
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
       ctx.push()
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/ctx.py", line 390, in push
       self.session = session_interface.open_session(self.app, self.request)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/sessions.py", line 340, in open_session
       s = self.get_signing_serializer(app)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/sessions.py", line 336, in get_signing_serializer
       signer_kwargs=signer_kwargs,
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/itsdangerous/serializer.py", line 95, in __init__
       self.is_text_serializer = is_text_serializer(serializer)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/itsdangerous/serializer.py", line 13, in is_text_serializer
       return isinstance(serializer.dumps({}), text_type)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/tag.py", line 305, in dumps
       return dumps(self.tag(value), separators=(",", ":"))
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/__init__.py", line 211, in dumps
       rv = _json.dumps(obj, **kwargs)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/simplejson/__init__.py", line 412, in dumps
       **kw).encode(obj)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/airflow/utils/json.py", line 36, in __init__
       super().__init__(*args, **kwargs)
   TypeError: __init__() got an unexpected keyword argument 'encoding'
   
   During handling of the above exception, another exception occurred:
   
   Traceback (most recent call last):
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 135, in handle
       self.handle_request(listener, req, client, addr)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 176, in handle_request
       respiter = self.wsgi(environ, resp.start_response)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 2464, in __call__
       return self.wsgi_app(environ, start_response)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 2450, in wsgi_app
       response = self.handle_exception(e)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 1879, in handle_exception
       server_error = handler(server_error)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/airflow/www/views.py", line 372, in show_traceback
       else 'Error! Please contact server admin.',
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/templating.py", line 136, in render_template
       ctx.app.update_template_context(context)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/app.py", line 838, in update_template_context
       context.update(func())
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask_login/utils.py", line 368, in _user_context_processor
       return dict(current_user=_get_user())
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask_login/utils.py", line 335, in _get_user
       current_app.login_manager._load_user()
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask_login/login_manager.py", line 346, in _load_user
       is_missing_user_id = 'user_id' not in session
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/werkzeug/local.py", line 379, in <lambda>
       __contains__ = lambda x, i: i in x._get_current_object()
   TypeError: argument of type 'NoneType' is not iterable
   127.0.0.1 - - [13/Dec/2020:11:37:36 -0800] "GET /home HTTP/1.1" 500 0 "-" "-"
   ^C[2020-12-13 11:37:38,288] {webserver_command.py:430} INFO - Received signal: 2. Closing gunicorn.
   [2020-12-13 11:37:38 -0800] [85061] [INFO] Handling signal: int
   [2020-12-13 11:37:38 -0800] [85064] [INFO] Worker exiting (pid: 85064)
   [2020-12-13 11:37:38 -0800] [85061] [INFO] Shutting down: Master
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744067723


   (Wrong issued linked -- ignore that PR)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think for most implementations airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.
   
   E.g. if you look at airlfow helm chart, there is only one airflow image --- not one for scheduler, another for webserver.
   
   So even though you only use it in task execution, the mere fact of its inclusion means webserver will use it and webserver will break.
   
   But this is interesting idea -- maybe there is some way to make it so webserver refuses to import simplejson even if it exists in the environment 🤔.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744071667


   What version of flask do you have @dstandish?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744061971


   > This should not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.
   
   Is this really an assumption?  Airflow is a both an app and a development framework.  We do provide PythonVirtualenvOperator, but most operators are not in virtualenv.  And users routinely make custom operators, installing additional dependencies alongside airflow.  
   
   And if your operator uses a library which installs simplejson, you will break airflow.
   
   Are we also sure that, among all the providers we have, there is not _one_ that installs simplejson?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744061971


   > This should not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.
   
   Is this really an assumption?  Airflow is a both an app and a development framework.  We do provide PythonVirtualenvOperator, but most operators are not run in separate virtualenv.  And users routinely make custom operators, installing additional dependencies alongside airflow.  
   
   And if your operator uses a library which installs simplejson, you will break airflow.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.  But this is interesting idea -- maybe we can make it so webserver refuses to import simplejson even if it exists in the environment 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744061971


   > This should not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.
   
   Is this really an assumption?  Airflow is a both an app and a development framework.  We do provide PythonVirtualenvOperator, but most operators are not run in separate virtualenv.  And users routinely make custom operators, installing additional dependencies alongside airflow.  
   
   And if your operator uses a library which installs simplejson, you will break airflow.
   
   Are we also sure that, among all the providers we have, there is not _one_ that installs simplejson?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.  But this is interesting idea -- maybe there is some way to make it so webserver refuses to import simplejson even if it exists in the environment 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744076667


   Oh, it's not... it's the upstream flask code that somehow ends up invoking this class 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744077095


   This hack at least lets webserver run... though, not sure if it has negative effects:
   ```python
   class AirflowJsonEncoder(json.JSONEncoder):
       """Custom Airflow json encoder implementation."""
   
       def __init__(self, *args, **kwargs):
           allowed_kwargs = {
               'skipkeys',
               'ensure_ascii',
               'check_circular',
               'allow_nan',
               'sort_keys',
               'indent',
               'separators',
               'default',
           }
           super().__init__(*args, **{k: kwargs[k] for k in allowed_kwargs})
           self.default = self._default
   
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744073333


   Ah, from here in simplejson:
   
   ```
           iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
               check_circular=check_circular, allow_nan=allow_nan, indent=indent,
               separators=separators, encoding=encoding,
               default=default, use_decimal=use_decimal,
               namedtuple_as_object=namedtuple_as_object,
               tuple_as_array=tuple_as_array,
               iterable_as_array=iterable_as_array,
               bigint_as_string=bigint_as_string,
               sort_keys=sort_keys,
               item_sort_key=item_sort_key,
               for_json=for_json,
               ignore_nan=ignore_nan,
               int_as_string_bitcount=int_as_string_bitcount,
               **kw).iterencode(obj)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062418


   For the cases you listed, _webserver_ is not involved. To what I understand from the issue you shared earlier above, it's only "conflicting" with _webserver_, not _executor_ or _LocalJob_, etc.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744071994


   https://github.com/pallets/flask/commit/c43edfc7c08b8dbe883761bec14fa75ee7db1bf3 -- claims that flask doesn't use simplejson anymore.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
XD-DENG edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062418


   For the cases you listed, _webserver_ is not involved. To what I understand from the issue you shared earlier above, `simplejson` is only "conflicting" with _webserver_, not _executor_ or _LocalJob_, etc.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.  But this is interesting idea -- maybe we can make it so webserver refuses to import simplejson even if it exists 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744078517


   This is also a possible solution:
   ```python
   try:
       import simplejson
   except ImportError:
       simplejson = None
   
   _JSONEncoder = simplejson.JSONEncoder if simplejson else json.JSONEncoder
   
   
   class AirflowJsonEncoder(_JSONEncoder):
       """Custom Airflow json encoder implementation."""
   
       def __init__(self, *args, **kwargs):
           super().__init__(*args, **kwargs)
           self.default = self._default
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744078517


   This is also a possible solution:
   ```python
   try:
       import simplejson
   except ImportError:
       simplejson = None
   
   _JSONEncoder = simplejson.JSONEncoder if simplejson else json.JSONEncoder
   
   
   class AirflowJsonEncoder(_JSONEncoder):
       """Custom Airflow json encoder implementation."""
   
       def __init__(self, *args, **kwargs):
           super().__init__(*args, **kwargs)
           self.default = self._default
   ```
   
   Since flask will use simplejson when present, we could adapt the behavior of our encoder when it is.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish removed a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish removed a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744077095






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb closed issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb closed issue #13046:
URL: https://github.com/apache/airflow/issues/13046


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080511


   Sorry, last one... i think this is most reasonable:
   ```python
   Or this, to ensure we use whatever flask is using:
   ```python
   from flask.json import JSONEncoder
   ...
   class AirflowJsonEncoder(JSONEncoder):
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080511


   Sorry, last one... i think this is most reasonable:
   ```python
   from flask.json import JSONEncoder
   ...
   class AirflowJsonEncoder(JSONEncoder):
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744059563


   This should not be considered as a bug.
   
   `simplejson` implements `JSONEncoder` in incompatible way from the `JSONEncoder` of Python built-in standard library `json` (which is used by Airflow). Meanwhile, `simplejson` is NOT Airflow's dependency.
   
   This should  not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744076303


   One thing that is curious to me is that fundamentally the error occurs in init of `AirflowJsonEncoder`, when it forwards kwargs to parent `json.JSONEncoder`.
   
   Why is `json.JSONEncoder` affected by `simplejson`?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744071104


    > Deprecated since version 3.1, will be removed in version 3.9: encoding keyword argument.
   
   So arguably simplejson is just a bit eager in not supporting that argument.
   
   We should fix this, but it's not big enough to stop 2.0.0 release.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] XD-DENG edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
XD-DENG edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744059563


   `simplejson` implements `JSONEncoder` in incompatible way from the `JSONEncoder` of Python built-in standard library `json` (which is used by Airflow). Meanwhile, `simplejson` is NOT Airflow's dependency.
   
   This should  not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080118


   Or this, to ensure we use whatever flask is using:
   ```python
   from flask.json import _json
   
   try:
       from kubernetes.client import models as k8s
   except ImportError:
       k8s = None
   
   
   class AirflowJsonEncoder(_json.JSONEncoder):
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744071994


   https://github.com/pallets/flask/commit/c43edfc7c08b8dbe883761bec14fa75ee7db1bf3 -- claims that flask doesn't use simplejson anymore. Edit. Oh: Flask 2.0, which is not yet released.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   Generally speaking airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.  But this is interesting idea -- maybe we can make it so webserver refuses to import simplejson even if it exists 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish removed a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish removed a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080118


   Or this, to ensure we use whatever flask is using:
   ```python
   from flask.json import _json
   ...
   class AirflowJsonEncoder(_json.JSONEncoder):
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744071308


   Wait -- I don't even see where in the callstack the `encoding` argument is actually coming from.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744072335


   > What version of flask do you have @dstandish?
   
   1.1.2


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744073378


   @dstandish This is a bug in simplejson :(


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744065890


   I agree that in most cases there should be the same image used for all containers. I know for security reasons cloud providers are making they webserver slightly different than scheduler /workers (for example both Composer and MWAW disallow plugins for the webserver) but i think it would be great to protect against simplejson breaking airflow webserver in non-cloud mode.
   
   In Airflow 2.0 we allow custom providers in Airflow and they can modify webserver behaviour in a more secure way - they have to provide 'airflow.provider' entry point and this can be verified at installation time by the providers (like only allow built-in providers and do not allow any other packages that have that entry point defined). I am not sure if that is 'secure enough' for the cloud providers (will check with composer) but i imagine in this case just installing a package with simplejson as dependency would break webserver.
   
   And for on premise installation i think sharing same image is rather common. So protecting against that might be a good idea. 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think for most implementations airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.
   
   E.g. if you look at airlfow helm chart, there is only one airflow image --- not one for scheduler, another for webserver.
   
   So even though you only use it in task execution, the mere fact of its inclusion means webserver will use it and webserver will break.
   
   But this is interesting idea -- maybe there is some way to make it so webserver refuses to import simplejson even if it exists in the environment 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think for most implementations airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.
   
   E.g. if you look at airlfow helm chart, there is only one airflow image --- not one for scheduler, another for webserver.
   
   So even though you only need it in task execution, the mere fact of its inclusion means webserver will use it and webserver will break.
   
   But this is interesting idea -- maybe there is some way to make it so webserver refuses to import simplejson even if it exists in the environment 🤔.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744073088


   Question still remains -- where in this call stack is `encoding` actually coming from? `dumps(self.tag(value), separators=(",", ":"))` is a fixed list of arguments, and dumps on line 211 has this
   
   ```python
       _dump_arg_defaults(kwargs, app=app)
       encoding = kwargs.pop("encoding", None)
       rv = _json.dumps(obj, **kwargs)
   ```
   
   ```
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/itsdangerous/serializer.py", line 13, in is_text_serializer
       return isinstance(serializer.dumps({}), text_type)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/tag.py", line 305, in dumps
       return dumps(self.tag(value), separators=(",", ":"))
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/flask/json/__init__.py", line 211, in dumps
       rv = _json.dumps(obj, **kwargs)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/simplejson/__init__.py", line 412, in dumps
       **kw).encode(obj)
     File "/Users/dstandish/.pyenv/versions/3.7.5/lib/python3.7/site-packages/airflow/utils/json.py", line 36, in __init__
       super().__init__(*args, **kwargs)
   TypeError: __init__() got an unexpected keyword argument 'encoding'
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080118


   Or this, to ensure we use whatever flask is using:
   ```python
   from flask.json import _json
   ...
   class AirflowJsonEncoder(_json.JSONEncoder):
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744061971


   > This should not be considered as Airflow's bug, especially we assume different applications should be installed in isolated virtual envs.
   
   Is this really an assumption?  Airflow is a both an app and a development framework.  We do provide PythonVirtualenvOperator, but most operators are not run in separate virtualenv.  And users routinely make custom operators, installing additional dependencies alongside airflow.  
   
   And if your operator uses a library which installs simplejson, you will break airflow.
   
   True simplejson is not airflow dependency, but Flask is, and it has conditional behavior based on presence of simplejson that, for whatever reason, breaks if it is present.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744080511


   OK i think we may be able to just use whatever flask is using:
   ```python
   from flask.json import JSONEncoder
   ...
   class AirflowJsonEncoder(JSONEncoder):
   ```
   
   Created PR in case this is desirable: https://github.com/apache/airflow/pull/13050


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish edited a comment on issue #13046: installation of simplejson breaks airlfow webserver 2.0.0rc2

Posted by GitBox <gi...@apache.org>.
dstandish edited a comment on issue #13046:
URL: https://github.com/apache/airflow/issues/13046#issuecomment-744062618


   In practice I think airflow webserver uses same exact `requirements.txt` file or docker image as scheduler.
   
   So even though you only use it in task execution, the mere fact of its inclusion means webserver will use it and webserver will break.
   
   But this is interesting idea -- maybe there is some way to make it so webserver refuses to import simplejson even if it exists in the environment 🤔


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org