You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Joe McDonnell (Code Review)" <ge...@cloudera.org> on 2021/06/02 21:20:40 UTC

[Impala-ASF-CR] IMPALA-10489: Implement JWT support

Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17435 )

Change subject: IMPALA-10489: Implement JWT support
......................................................................


Patch Set 9:

(21 comments)

I have made a first full pass over this.

I think this provides a level of functionality that will allow us to develop the client side like Impyla and others. There are pieces of functionality that can be expanded in later changes (more than RSA algorithms, support more JWKS options, etc). Looking at the RFCs, we may be requiring fields that are optional, etc.

http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/rpc/authentication.cc@156
PS9, Line 156: DEFINE_string(jwt_custom_claim_username
Do we have any code that would validate this value? (i.e. it should not be the empty string)


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/rpc/authentication.cc@156
PS9, Line 156: "Custom claim 'username'"
Nit: I would expand on this description to note that this specifies the custom claim in the JWT that contains the username for the session.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/rpc/authentication.cc@1432
PS9, Line 1432:     if (use_saml) {
              :       SecureAuthProvider* sap = NULL;
              :       external_http_auth_provider_.reset(sap = new SecureAuthProvider(false));
              :       sap->InitSaml();
              :       LOG(INFO) <<
              :           "External communication is authenticated for hs2-http protocol with SAML2 SSO";
              :     } else {
              :       LOG(INFO) << "External communication is not authenticated for hs2-http protocol";
              :     }
I think we want to be able to have JWT as the only authentication method (without LDAP or anything else), which means that this code needs to detect JWT support and use SecureAuthProvider() like it does for SAML.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/transport/THttpServer.cpp@278
PS9, Line 278:         } else if (use_jwt_token_
             :             && stripped_bearer_auth_token.find('.') != string::npos) {
             :           // Since Impala supports multiple auth mechanism in parallel, falls back to JWT.
             :           if (callbacks_.jwt_token_auth_fn(stripped_bearer_auth_token)) {
             :             authorized = true;
             :             if (metrics_enabled_) {
             :               http_metrics_->total_jwt_token_auth_success_->Increment(1);
             :             }
             :           }
The other way to do this is to move the below JWT auth section above this "if (!authorized && has_saml_)" section. The JWT section would always fall through to the SAML section if it fails.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc
File be/src/util/jwt-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@145
PS9, Line 145:   // Delete this temporary file
             :   void Delete();
If this isn't used directly, then you can make it private.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@149
PS9, Line 149: const char* Filename() const { return name_.c_str(); }
Since JsonWebKeySet::Init() takes the filename in as a "const std::string&", this can return a "const std::string&".


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@169
PS9, Line 169: NULL
Nit: Use nullptr rather than NULL for new code.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@224
PS9, Line 224:   TempTestDataFile* jwks_file = new TempTestDataFile(
             :       "{"
             :       "  \"keys\": ["
             :       "    {"
             :       "      \"use\": \"sig\","
             :       "      \"kty\": \"RSA\","
             :       "      \"alg\": \"RS256\","
             :       "      \"n\": \"sttddbg-_yjXzcFpbMJB1fIFam9lQBeXWbTqzJwbuFbspHMsRowa8FaPw\","
             :       "      \"e\": \"AQAB\""
             :       "    }"
             :       "  ]"
             :       "}");
             :   JsonWebKeySet* jwks = new JsonWebKeySet();
Here and elsewhere in this file, use smart pointers / std::unique_ptr wherever possible rather than new/delete pairs.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@348
PS9, Line 348: TEST(JwtUtilTest, VerifyJwtRS512) {
             :   // Sign JWT token with RS512.
             :   TempTestDataFile jwks_file(Substitute(jwks_file_format, kid_1, "RS512", rsa512_pub_key,
             :       kid_2, "RS512", rsa512_pub_key_invalid));
             :   JsonWebKeySet* jwks = new JsonWebKeySet();
             :   Status status = jwks->Init(jwks_file.Filename());
             :   ASSERT_TRUE(status.ok());
             :   ASSERT_EQ(2, jwks->GetKeyNum());
             : 
             :   const JsonWebKey* key1 = jwks->LookupKey(kid_1);
             :   ASSERT_TRUE(key1 != nullptr);
             :   ASSERT_EQ(ADD_PUB_KEY_HEADER_FOOTER(rsa512_pub_key), key1->key_value_);
             : 
             :   // Create a JWT token and sign it with RS512.
             :   string priv_key = ADD_RSA_PRIV_KEY_HEADER_FOOTER(rsa512_priv_key);
             :   string pub_key = ADD_PUB_KEY_HEADER_FOOTER(rsa512_pub_key);
             :   auto token = jwt::create()
             :                    .set_issuer("auth0")
             :                    .set_type("JWS")
             :                    .set_algorithm("RS512")
             :                    .set_key_id(kid_1)
             :                    .set_payload_claim("username", picojson::value("impala"))
             :                    .sign(jwt::algorithm::rs512(pub_key, priv_key, "", ""));
             : 
             :   // Verify the JWT token with our wrapper API which use public key retrieved from JWKS,
             :   // and read username from the token.
             :   string username;
             :   status = VerifyJWTTokenExtractUsername(token, jwks, "username", username);
             :   ASSERT_TRUE(status.ok());
             :   ASSERT_EQ("impala", username);
             : 
             :   delete jwks;
             : }
For completeness, let's also add a similar case with RS384


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util-test.cc@456
PS9, Line 456:   ASSERT_TRUE(status.ok());
When checking that status is ok, use ASSERT_OK/EXPECT_OK from testutil/gtest-util.h. That will print the not-OK status if it fails. Please go through this file and fix all the locations like this.

(The ASSERT_FALSE(status.ok()) doesn't need to change, because there isn't any interesting detail about an OK status.)


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc
File be/src/util/jwt-util.cc:

http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@103
PS9, Line 103:     if (!keys.IsArray()) {
             :       return Status(TErrorCode::JWKS_PARSE_ERROR,
             :           Substitute(
             :               "'keys' must be of type Array but is a '$0'", NameOfTypeOfJsonValue(keys)));
             :     }
Should we add an additional check that the array has non-zero Size()?


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@131
PS9, Line 131:     for (Value::ConstMemberIterator member = json_key.MemberBegin();
             :          member != json_key.MemberEnd(); ++member) {
For this for loop, you might just insert all the values into an unordered_map and put the actual logic for interpreting the values below the for loop.

Everything is a string, so there aren't typing issues. We are doing this once at startup, so this isn't very performance sensitive.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@133
PS9, Line 133:      if (strcmp("use", member->name.GetString()) == 0) {
             :         RETURN_IF_ERROR(ReadKeyProperty("use", json_key, &key_use, /*required*/ false));
Do we care about what "use" is set to? Do we want to require any specific value?


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@146
PS9, Line 146: else if (strcmp("n", member->name.GetString()) == 0) {
             :         found_kvalue = true;
             :         RETURN_IF_ERROR(ReadKeyProperty("n", json_key, &key_value));
             :         if (key_value.empty()) {
             :           return Status("'n' property must be a non-empty string");
             :         }
This works for RSA. Other encryption types like elliptic curve require other parameters. When we support that, I'm guessing we can just add the others here and then do the validation of the combinations down at the if (!found_kid || !found_kvalue) condition below.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@204
PS9, Line 204: NULL
Use "nullptr" in new code rather than NULL


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@249
PS9, Line 249: VerifyJWTTokenExtractUsername
From a style point of view, I think I would split up this function. I would do something like:
1. Define a type for the decoded JWT token
2. Have one function that parses the JSON into that JWT token type
3. Have a function that verifies the JWT. I think I would have a Verify() function on the JWKS structure (maybe that calls down to a Verify() function on the individual JWK).
4. A final function that reads the username (or just leave this in the JWTTokenAuth() function).


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@282
PS9, Line 282:       if (strncasecmp(algorithm.c_str(), jwk->algorithm_.c_str(), algorithm.length())
             :           != 0) {
To be careful, compare the lengths as well to make sure this isn't just matching a prefix (or somehow algorithm.length() == 0).


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@290
PS9, Line 290:      if (strncasecmp(algorithm.c_str(), "rs256", algorithm.length()) == 0) {
             :         jwt::verify()
             :             .allow_algorithm(jwt::algorithm::rs256(jwk->key_value_, "", "", ""))
             :             .with_issuer(issuer)
             :             .verify(decoded_token);
             :       } else if (strncasecmp(algorithm.c_str(), "rs384", algorithm.length()) == 0) {
             :         jwt::verify()
             :             .allow_algorithm(jwt::algorithm::rs384(jwk->key_value_, "", "", ""))
             :             .with_issuer(issuer)
             :             .verify(decoded_token);
             :       } else if (strncasecmp(algorithm.c_str(), "rs512", algorithm.length()) == 0) {
             :         jwt::verify()
             :             .allow_algorithm(jwt::algorithm::rs512(jwk->key_value_, "", "", ""))
             :             .with_issuer(issuer)
             :             .verify(decoded_token);
One performance thing we may want to do is to construct the verifier once per JWK and then just use it to call verify on each new JWT. That doesn't need to happen in this change, but it may have a performance impact.


http://gerrit.cloudera.org:8080/#/c/17435/9/be/src/util/jwt-util.cc@312
PS9, Line 312:     // Get value of custom claim 'username' from the token payload.
             :     if (decoded_token.has_payload_claim(jwt_custom_claim_username)) {
             :       // Assume the claim data type of 'username' is string.
             :       username.assign(
             :           decoded_token.get_payload_claim(jwt_custom_claim_username).to_json().to_str());
             :     }
What happens if the username claim is not present? Should we return an error here? Similarly, what if the username is the empty string?


http://gerrit.cloudera.org:8080/#/c/17435/9/fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java
File fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java:

http://gerrit.cloudera.org:8080/#/c/17435/9/fe/src/test/java/org/apache/impala/customcluster/LdapHS2Test.java@421
PS9, Line 421:   /**
             :    * Tests if sessions are authenticated by verifying the JWT token for connections
             :    * to the HTTP hiveserver2 endpoint.
             :    * Since we don't have Java version of JWT library, we use pre-calculated JWT token
             :    * and JWKS. The token and JWK set used in this test case were generated by using
             :    * BE unit-test function JwtUtilTest::VerifyJwtRS256.
             :    */
             :   @Test
             :   public void testHiveserver2JwtAuth() throws Exception {
             :     String jwksFilename =
             :         new File(System.getenv("IMPALA_HOME"), "testdata/jwt/jwks_rs256.json").getPath();
             :     setUp(String.format(
             :         "--jwt_token_auth=true --jwt_validate_signature=true --jwks_file_path=%s "
             :             + "--jwt_allow_without_tls=true",
             :         jwksFilename));
             :     verifyMetrics(0, 0);
             :     THttpClient transport = new THttpClient("http://localhost:28000");
             :     Map<String, String> headers = new HashMap<String, String>();
             : 
             :     // Case 1: Authenticate with valid JWT Token in HTTP header.
             :     String jwtToken =
             :         "eyJhbGciOiJSUzI1NiIsImtpZCI6InB1YmxpYzpjNDI0YjY3Yi1mZTI4LTQ1ZDctYjAxNS1m"
             :         + "NzlkYTUwYjViMjEiLCJ0eXAiOiJKV1MifQ.eyJpc3MiOiJhdXRoMCIsInVzZXJuYW1lIjoia"
             :         + "W1wYWxhIn0.OW5H2SClLlsotsCarTHYEbqlbRh43LFwOyo9WubpNTwE7hTuJDsnFoVrvHiWI"
             :         + "02W69TZNat7DYcC86A_ogLMfNXagHjlMFJaRnvG5Ekag8NRuZNJmHVqfX-qr6x7_8mpOdU55"
             :         + "4kc200pqbpYLhhuK4Qf7oT7y9mOrtNrUKGDCZ0Q2y_mizlbY6SMg4RWqSz0RQwJbRgXIWSgc"
             :         + "bZd0GbD_MQQ8x7WRE4nluU-5Fl4N2Wo8T9fNTuxALPiuVeIczO25b5n4fryfKasSgaZfmk0C"
             :         + "oOJzqbtmQxqiK9QNSJAiH2kaqMwLNgAdgn8fbd-lB1RAEGeyPH8Px8ipqcKsPk0bg";
             :     headers.put("Authorization", "Bearer " + jwtToken);
             :     headers.put("X-Forwarded-For", "127.0.0.1");
             :     transport.setCustomHeaders(headers);
             :     transport.open();
             :     TCLIService.Iface client = new TCLIService.Client(new TBinaryProtocol(transport));
             : 
             :     // Open a session which will get username 'impala' from JWT token and use it as
             :     // login user.
             :     TOpenSessionReq openReq = new TOpenSessionReq();
             :     TOpenSessionResp openResp = client.OpenSession(openReq);
             :     // One successful authentication.
             :     verifyMetrics(0, 0);
             :     verifyJwtAuthMetrics(1, 0);
             :     // Running a query should succeed.
             :     TOperationHandle operationHandle = execAndFetch(
             :         client, openResp.getSessionHandle(), "select logged_in_user()", "impala");
             :     // Two more successful authentications - for the Exec() and the Fetch().
             :     verifyMetrics(0, 0);
             :     verifyJwtAuthMetrics(3, 0);
             : 
             :     // case 2: Authenticate fails with invalid JWT token which does not have signature.
             :     String invalidJwtToken =
             :         "eyJhbGciOiJSUzI1NiIsImtpZCI6InB1YmxpYzpjNDI0YjY3Yi1mZTI4LTQ1ZDctYjAxNS1m"
             :         + "NzlkYTUwYjViMjEiLCJ0eXAiOiJKV1MifQ.eyJpc3MiOiJhdXRoMCIsInVzZXJuYW1lIjoia"
             :         + "W1wYWxhIn0.";
             :     headers.put("Authorization", "Bearer " + invalidJwtToken);
             :     headers.put("X-Forwarded-For", "127.0.0.1");
             :     transport.setCustomHeaders(headers);
             :     try {
             :       openResp = client.OpenSession(openReq);
             :       fail("Exception exception.");
             :     } catch (Exception e) {
             :       verifyJwtAuthMetrics(3, 1);
             :       assertEquals(e.getMessage(), "HTTP Response code: 401");
             :     }
             :   }
These new test cases are testing LDAP + JWT. LdapHS2Test::setUp() includes LDAP auth settings. I think that we primarily want to test JWT on its own. So, I would like to split this off to a new test .java file that uses only the JWT startup options.

The history of the LDAP frontend tests is that Java had an LDAP library that made this testing easy. That is the main reason these are not our usual Python custom cluster tests. If it is possible to make this a Python custom cluster test, that would be preferable, but I'm also ok with splitting that off to another code change. The Python custom cluster test might be easier when impyla has support for JWTs.


http://gerrit.cloudera.org:8080/#/c/17435/9/fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java
File fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java:

http://gerrit.cloudera.org:8080/#/c/17435/9/fe/src/test/java/org/apache/impala/customcluster/LdapWebserverTest.java@276
PS9, Line 276:   /**
             :    * Tests if sessions are authenticated by verifying the JWT token for connections
             :    * to the Web Server.
             :    * Since we don't have Java version of JWT library, we use pre-calculated JWT token
             :    * and JWKS. The token and JWK set used in this test case were generated by using
             :    * BE unit-test function JwtUtilTest::VerifyJwtRS256.
             :    */
             :   @Test
             :   public void testWebserverJwtAuth() throws Exception {
             :     String jwksFilename =
             :         new File(System.getenv("IMPALA_HOME"), "testdata/jwt/jwks_rs256.json").getPath();
             :     setUp(String.format(
             :               "--jwt_token_auth=true --jwt_validate_signature=true --jwks_file_path=%s "
             :                   + "--jwt_allow_without_tls=true",
             :               jwksFilename),
             :         "");
             : 
             :     // Case 1: Authenticate with valid JWT Token in HTTP header.
             :     String jwtToken =
             :         "eyJhbGciOiJSUzI1NiIsImtpZCI6InB1YmxpYzpjNDI0YjY3Yi1mZTI4LTQ1ZDctYjAxNS1m"
             :         + "NzlkYTUwYjViMjEiLCJ0eXAiOiJKV1MifQ.eyJpc3MiOiJhdXRoMCIsInVzZXJuYW1lIjoia"
             :         + "W1wYWxhIn0.OW5H2SClLlsotsCarTHYEbqlbRh43LFwOyo9WubpNTwE7hTuJDsnFoVrvHiWI"
             :         + "02W69TZNat7DYcC86A_ogLMfNXagHjlMFJaRnvG5Ekag8NRuZNJmHVqfX-qr6x7_8mpOdU55"
             :         + "4kc200pqbpYLhhuK4Qf7oT7y9mOrtNrUKGDCZ0Q2y_mizlbY6SMg4RWqSz0RQwJbRgXIWSgc"
             :         + "bZd0GbD_MQQ8x7WRE4nluU-5Fl4N2Wo8T9fNTuxALPiuVeIczO25b5n4fryfKasSgaZfmk0C"
             :         + "oOJzqbtmQxqiK9QNSJAiH2kaqMwLNgAdgn8fbd-lB1RAEGeyPH8Px8ipqcKsPk0bg";
             :     attemptConnection("Bearer " + jwtToken, "127.0.0.1");
             :     verifyJwtAuthMetrics(Range.closed(1L, 1L), zero);
             : 
             :     // Case 2: Failed with invalid JWT Token.
             :     String invalidJwtToken =
             :         "eyJhbGciOiJSUzI1NiIsImtpZCI6InB1YmxpYzpjNDI0YjY3Yi1mZTI4LTQ1ZDctYjAxNS1m"
             :         + "NzlkYTUwYjViMjEiLCJ0eXAiOiJKV1MifQ.eyJpc3MiOiJhdXRoMCIsInVzZXJuYW1lIjoia"
             :         + "W1wYWxhIn0.";
             :     try {
             :       attemptConnection("Bearer " + invalidJwtToken, "127.0.0.1");
             :     } catch (IOException e) {
             :       assertTrue(e.getMessage().contains("Server returned HTTP response code: 401"));
             :     }
             :     verifyJwtAuthMetrics(Range.closed(1L, 1L), Range.closed(1L, 1L));
             :   }
Same comment as on the other Java LDAP test file. It would be nice to split this off to test without the LDAP startup flags.



-- 
To view, visit http://gerrit.cloudera.org:8080/17435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b71fa854c9ddc8ca882878853395e1eb866143c
Gerrit-Change-Number: 17435
Gerrit-PatchSet: 9
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <as...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <jo...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Wed, 02 Jun 2021 21:20:40 +0000
Gerrit-HasComments: Yes