Posted to yarn-issues@hadoop.apache.org by "Shixiong Zhu (JIRA)" <ji...@apache.org> on 2014/11/11 08:50:34 UTC

[jira] [Created] (YARN-2844) WebAppProxyServlet cannot handle urls which contain encoded characters

Shixiong Zhu created YARN-2844:
----------------------------------

             Summary: WebAppProxyServlet cannot handle urls which contain encoded characters
                 Key: YARN-2844
                 URL: https://issues.apache.org/jira/browse/YARN-2844
             Project: Hadoop YARN
          Issue Type: Bug
          Components: webapp
            Reporter: Shixiong Zhu
            Priority: Minor


WebAppProxyServlet mishandles URL encoding and decoding. This was found when running Spark on YARN.

When a user accesses "http://example.com:8088/proxy/application_1415344371838_0006/executors/threadDump/?executorId=%3Cdriver%3E", WebAppProxyServlet requests "http://example.com:36429/executors/threadDump/?executorId=%25253Cdriver%25253E", but the Spark web server expects "http://example.com:36429/executors/threadDump/?executorId=%3Cdriver%3E".

Here are the problems I found in WebAppProxyServlet:

1. java.net.URI.toString() returns an already-encoded URL string, so the following code in WebAppProxyServlet should pass `true` instead of `false`.
{code:java}
org.apache.commons.httpclient.URI uri = 
      new org.apache.commons.httpclient.URI(link.toString(), false);
{code}
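
Point 1 can be checked with the JDK alone. The sketch below builds a java.net.URI from unescaped components (host, port and query taken from the example above) and prints its string form, which is already percent-encoded. The class name `UriToStringDemo` and the helper `buildLink` are made up for illustration and are not part of WebAppProxyServlet.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriToStringDemo {
    /** Builds a link from unescaped components; a stand-in for `link` in the snippet above. */
    static URI buildLink() {
        try {
            return new URI("http", "example.com:36429",
                    "/executors/threadDump/", "executorId=<driver>", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI link = buildLink();
        System.out.println(link);               // escaped form: ...?executorId=%3Cdriver%3E
        System.out.println(link.getQuery());    // decoded query: executorId=<driver>
        System.out.println(link.getRawQuery()); // raw query: executorId=%3Cdriver%3E
    }
}
```

Because the string is already escaped, any consumer that treats it as unescaped will encode the '%' characters a second time.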

2. [HttpServletRequest.getPathInfo()|https://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServletRequest.html#getPathInfo()] returns a decoded string. Therefore, if the link is http://example.com:8088/proxy/application_1415344371838_0006/John%2FHunter, pathInfo will be "/application_1415344371838_0006/John/Hunter". The URI created in WebAppProxyServlet will then be something like ".../John/Hunter", but the correct link is ".../John%2FHunter". We can use [HttpServletRequest.getRequestURI()|https://docs.oracle.com/javaee/6/api/javax/servlet/http/HttpServletRequest.html#getRequestURI()] to get the raw path instead.
{code:java}
final String pathInfo = req.getPathInfo();
{code}
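
The difference between a decoded and a raw path can be illustrated without a servlet container. In the sketch below, java.net.URI.getPath() plays the role of the decoded getPathInfo() and getRawPath() plays the role of the raw getRequestURI(). This is only an analogy: the helper `rawRest` and the "/proxy" prefix handling are assumptions for illustration, not the servlet's actual code.

```java
import java.net.URI;

class RawPathDemo {
    /** Hypothetical helper: strips a servlet prefix from a raw (still-encoded) request path. */
    static String rawRest(String rawRequestPath, String prefix) {
        return rawRequestPath.startsWith(prefix)
                ? rawRequestPath.substring(prefix.length())
                : rawRequestPath;
    }

    public static void main(String[] args) {
        URI uri = URI.create(
                "http://example.com:8088/proxy/application_1415344371838_0006/John%2FHunter");
        // Decoded path: %2F has become a real '/', so the original segment is lost.
        System.out.println(uri.getPath());
        // Raw path: %2F is preserved and can be forwarded unchanged.
        System.out.println(uri.getRawPath());
        System.out.println(rawRest(uri.getRawPath(), "/proxy"));
    }
}
```

Once the path has been decoded, "John%2FHunter" and "John/Hunter" can no longer be told apart, which is why the raw form has to be kept from the start.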

3. The wrong URI constructor is used. [URI(String scheme, String authority, String path, String query, String fragment)|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#URI(java.lang.String,%20java.lang.String,%20java.lang.String,%20java.lang.String,%20java.lang.String)] encodes the path and query again even though they have already been encoded. The code should use [URI(String str)|https://docs.oracle.com/javase/7/docs/api/java/net/URI.html#URI(java.lang.String)] directly, since the URL has already been encoded.
{code:java}
      URI toFetch = new URI(trackingUri.getScheme(), 
          trackingUri.getAuthority(),
          StringHelper.ujoin(trackingUri.getPath(), rest), req.getQueryString(),
          null);
{code}
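
A minimal sketch of problem 3 and one possible fix, using only the JDK. `viaComponents` mirrors the multi-argument construction quoted above, and `viaString` assembles the already-encoded pieces into a string and parses it once. Both method names and the class `ProxyUriDemo` are hypothetical; the real servlet also joins paths with StringHelper.ujoin, which is left out here.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ProxyUriDemo {
    /** Multi-argument constructor: every '%' in the input is escaped again as %25. */
    static String viaComponents(URI trackingUri, String rawPath, String rawQuery) {
        try {
            return new URI(trackingUri.getScheme(), trackingUri.getAuthority(),
                    rawPath, rawQuery, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Single-string parsing: existing escapes are kept exactly as they are. */
    static String viaString(URI trackingUri, String rawPath, String rawQuery) {
        StringBuilder sb = new StringBuilder();
        sb.append(trackingUri.getScheme()).append("://")
          .append(trackingUri.getRawAuthority()).append(rawPath);
        if (rawQuery != null) {
            sb.append('?').append(rawQuery);
        }
        return URI.create(sb.toString()).toString();
    }

    public static void main(String[] args) {
        URI tracking = URI.create("http://example.com:36429");
        String path = "/executors/threadDump/";
        String query = "executorId=%3Cdriver%3E"; // already encoded, as from getQueryString()
        System.out.println(viaComponents(tracking, path, query)); // ...?executorId=%253Cdriver%253E
        System.out.println(viaString(tracking, path, query));     // ...?executorId=%3Cdriver%3E
    }
}
```

This accounts for one extra layer of encoding (%3C to %253C). The second layer seen in the report (%25253C) presumably comes from problem 1, where the already-escaped string is treated as unescaped.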




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)