Posted to commits@airflow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/01/26 18:54:00 UTC

[jira] [Commented] (AIRFLOW-2190) base_url with a subpath generates TypeError

    [ https://issues.apache.org/jira/browse/AIRFLOW-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16753151#comment-16753151 ] 

ASF GitHub Bot commented on AIRFLOW-2190:
-----------------------------------------

astahlman commented on pull request #4596: [AIRFLOW-2190] Fix TypeError when returning 404
URL: https://github.com/apache/airflow/pull/4596
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ X ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-2190
   
   ### Description
   
   - [ X ] Here are some details about my PR, including screenshots of any UI changes:
   
   When processing HTTP response headers, gunicorn checks that the name of each
   header is a string. Here's the relevant gunicorn code:
   
   From gunicorn/http/wsgi.py, line 257:
   
       def process_headers(self, headers):
           for name, value in headers:
               if not isinstance(name, string_types):
                   raise TypeError('%r is not a string' % name)
   
   On Python 3, `string_types` is a one-element tuple containing the built-in
   `str`. On Python 2, it contains `basestring`. Again, the relevant gunicorn code:
   
   From gunicorn/six.py, line 38:
   
       if PY3:
           string_types = str,
           ...
       else:
           string_types = basestring,
   
   On Python 2 the `b''` literal produces a `str`, but on Python 3 it produces
   `bytes`. `bytes` is not a subclass of `str`, so the `isinstance` check fails
   and we get the following error when returning a 404 on Python 3:
   
       File "/usr/local/lib/python3.6/site-packages/airflow/www/app.py", line 166, in root_app
       resp(b'404 Not Found', [(b'Content-Type', b'text/plain')])
       File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 261, in start_response
       self.process_headers(headers)
       File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 268, in process_headers
       raise TypeError('%r is not a string' % name)
       TypeError: b'Content-Type' is not a string
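
   To see the check fail in isolation, here is a minimal sketch that copies the
   two gunicorn snippets quoted above into a standalone function. It assumes
   Python 3, so `string_types` is hard-coded to `(str,)`, and the `check`
   helper is a hypothetical wrapper added only for illustration.

```python
# Stand-in for gunicorn's six.string_types on Python 3 (assumption: PY3 only).
string_types = (str,)


def process_headers(headers):
    """Simplified copy of the header-name check quoted above."""
    for name, value in headers:
        if not isinstance(name, string_types):
            raise TypeError('%r is not a string' % name)
    return headers


def check(headers):
    """Hypothetical helper: return 'ok' or the TypeError message."""
    try:
        process_headers(headers)
        return "ok"
    except TypeError as exc:
        return str(exc)


# Bytes header names are rejected; native strings pass.
bytes_result = check([(b'Content-Type', b'text/plain')])
str_result = check([('Content-Type', 'text/plain')])
```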
   
   Dropping the `b` prefix in favor of plain string literals should work on both
   Python 2 and Python 3, because an unprefixed literal is the native string type
   that gunicorn checks for on each version, as demonstrated below:
   
       Python 3.7.2 (default, Jan 13 2019, 12:50:15)
       [Clang 10.0.0 (clang-1000.11.45.5)] on darwin
       Type "help", "copyright", "credits" or "license" for more information.
       >>> isinstance('foo', str)
       True
   
       Python 2.7.15 (default, Jan 12 2019, 21:43:48)
       [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)] on darwin
       Type "help", "copyright", "credits" or "license" for more information.
       >>> isinstance('foo', basestring)
       True
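
   Applied to a 404 handler, the change might look like the sketch below. It is
   modeled on the `resp(...)` call shown in the traceback, not copied from
   Airflow: the name `not_found_app`, the response body, and the
   `fake_start_response` recorder are assumptions made for illustration. Note
   that PEP 3333 asks for native strings only in the status line and headers;
   the response body is still bytes.

```python
from wsgiref.util import setup_testing_defaults


def not_found_app(environ, start_response):
    """Hypothetical 404 fallback using native strings for status and headers."""
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    # The body of a WSGI response remains bytes on Python 3.
    return [b'Not Found']


# Record what the app passes to start_response (illustrative stand-in for a server).
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured['status'] = status
    captured['headers'] = headers


environ = {}
setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environ
body = b''.join(not_found_app(environ, fake_start_response))
```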
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   Not sure if there's a clean way to unit test this middleware - open to suggestions. FWIW, I ran the webserver and confirmed that this fixes the issue in AIRFLOW-2190.
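
   One possible approach, offered as a suggestion and not something this PR
   includes: the standard library's `wsgiref.validate.validator` wraps a WSGI
   app and raises `AssertionError` when the status or header names are not
   native strings, so a test can exercise the handler without starting
   gunicorn. In the sketch below, `make_not_found_app` and `passes_validation`
   are hypothetical helpers, and the app is a stand-in for the real middleware.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def make_not_found_app(status, headers):
    """Build a hypothetical 404 app so both variants can be exercised."""
    def app(environ, start_response):
        start_response(status, headers)
        return [b'Not Found']
    return app


def passes_validation(app):
    """Return True if the app satisfies wsgiref's WSGI compliance checks."""
    environ = {}
    setup_testing_defaults(environ)
    try:
        result = validator(app)(
            environ, lambda status, headers, exc_info=None: None)
        body = list(result)
        result.close()  # the validator expects the iterable to be closed
    except AssertionError:
        return False
    return body == [b'Not Found']


# Native strings pass; the bytes variant from the traceback is rejected.
fixed_ok = passes_validation(
    make_not_found_app('404 Not Found', [('Content-Type', 'text/plain')]))
buggy_ok = passes_validation(
    make_not_found_app(b'404 Not Found', [(b'Content-Type', b'text/plain')]))
```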
   
   ### Commits
   
   - [ X ] My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes how to use it.
     - When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
     - All the public functions and the classes in the PR contain docstrings that explain what it does
   
   ### Code Quality
   
   - [ X ] Passes `flake8`
   
 


> base_url with a subpath generates TypeError
> -------------------------------------------
>
>                 Key: AIRFLOW-2190
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2190
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: webserver
>    Affects Versions: 1.9.0, 1.10.1
>            Reporter: John Arnold
>            Priority: Major
>
> I'm running into what looks like a bug in airflow webserver. Running against master:
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: [2018-03-07 18:20:13 +0000] [102] [ERROR] Error handling request /
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: Traceback (most recent call last):
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 135, in handle
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: self.handle_request(listener, req, client, addr)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/sync.py", line 176, in handle_request
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: respiter = self.wsgi(environ, resp.start_response)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/werkzeug/wsgi.py", line 826, in call
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: return app(environ, start_response)
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/airflow/www/app.py", line 166, in root_app
> Mar 7 18:20:13 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: resp(b'404 Not Found', [(b'Content-Type', b'text/plain')])
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 261, in start_response
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: self.process_headers(headers)
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: File "/usr/local/lib/python3.6/site-packages/gunicorn/http/wsgi.py", line 268, in process_headers
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: raise TypeError('%r is not a string' % name)
> Mar 7 18:20:14 netdocker1-eastus2 daemon INFO ca5ce9db3af6[92630]: TypeError: b'Content-Type' is not a string
>  
> I just started using the base_url to put the webserver behind an nginx proxy under a sub-path, e.g. [http://domain.com/airflow]
> I've tried following the docs for nginx proxy, i.e.
> [webserver]
> base_url = [http://localhost/airflow|http://airflow-web/airflow]
>  
> I've also tried setting the base_url to the fully-qualified endpoint:
> base_url = [https://example.com/airflow|https://domain.com/airflow]
>  
> Neither works; both raise the TypeError exception.
>  
> If I remove the sub-path:
> base_url = [https://example.com|https://domain.com/]
> then the app starts and runs OK, and I can access it on the host but not through the proxy.
>  
>  


