Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/11/17 16:25:34 UTC

[GitHub] [beam] andreigurau opened a new pull request, #24229: Add rootCaCertificate option to SplunkIO

andreigurau opened a new pull request, #24229:
URL: https://github.com/apache/beam/pull/24229

   This PR ports the `rootCaCertificate` option from the SplunkIO implementation in DataflowTemplates ([here](https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/v1/src/main/java/com/google/cloud/teleport/splunk/SplunkIO.java)) to Beam's SplunkIO, so that DataflowTemplates can use Beam's implementation. There is currently an inconsistency: classic templates use the DataflowTemplates implementation while flex templates use Beam's implementation, and Beam's implementation seems to be missing some options.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] andreigurau commented on pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on PR #24229:
URL: https://github.com/apache/beam/pull/24229#issuecomment-1320477563

   To fix the RAT failures, all I did was append `.txt` to the test certificate file names, and now the check passes. The tests still work because we read the contents of the file and do not depend on its name. The only drawback is that, in practice, these cert files should end with `.crt`, not `.txt`.
   
   The other solution, which I don't like as much, is to hardcode the contents of the `.crt` files as strings and write them out as temporary `.crt` files during testing, but I feel that could add a lot of clutter to the test files.
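
For illustration, the second option (hardcoded strings written to temporary `.crt` files) could look roughly like the sketch below. The names `TempCertFiles` and `writeTempCert` and the placeholder PEM body are hypothetical and are not taken from this PR; a real test would paste an actual PEM-encoded certificate into the constant.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical test helper: materializes a hardcoded PEM string as a temporary .crt file. */
class TempCertFiles {

  // Placeholder only; this is not a parseable certificate.
  static final String PLACEHOLDER_PEM =
      "-----BEGIN CERTIFICATE-----\n"
          + "PLACEHOLDER\n"
          + "-----END CERTIFICATE-----\n";

  /** Writes {@code pem} to a new temp file ending in .crt and returns its path. */
  static Path writeTempCert(String pem) throws IOException {
    Path certFile = Files.createTempFile("test-root-ca", ".crt");
    Files.write(certFile, pem.getBytes(StandardCharsets.US_ASCII));
    // Ask the JVM to delete the file on exit so the test leaves nothing behind.
    certFile.toFile().deleteOnExit();
    return certFile;
  }
}
```

Because nothing is checked into the repository, RAT has no `.crt` file to inspect, at the cost of carrying the certificate text inside the test source.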




[GitHub] [beam] andreigurau commented on a diff in pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on code in PR #24229:
URL: https://github.com/apache/beam/pull/24229#discussion_r1026836133


##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/HttpEventPublisher.java:
##########
@@ -349,14 +378,24 @@ private CloseableHttpClient getHttpClient(
                 ? NoopHostnameVerifier.INSTANCE
                 : new DefaultHostnameVerifier();
 
-        SSLContextBuilder sslContextBuilder = SSLContextBuilder.create();
+        SSLContext sslContext = SSLContextBuilder.create().build();
         if (disableCertificateValidation) {
           LOG.info("Certificate validation is disabled");
-          sslContextBuilder.loadTrustMaterial((TrustStrategy) (chain, authType) -> true);
+          sslContext =
+              SSLContextBuilder.create()

Review Comment:
   Honestly, I think redefining `sslContext` in this if block is the cleanest solution I can think of. In the `else if (rootCaCertificate != null)` block, we expect `sslContext` to already be defined so that we can call its `init` method. If we used the same `SSLContextBuilder` instead, the code might look something like this:
   ```
   SSLContextBuilder sslContextBuilder = SSLContextBuilder.create();
   SSLContext sslContext;
   if (disableCertificateValidation) {
     sslContextBuilder.loadTrustMaterial((TrustStrategy) (chain, authType) -> true);
     sslContext = sslContextBuilder.build();
   } else if (rootCaCertificate != null) {
     ...
     sslContext = sslContextBuilder.build();
     sslContext.init(...);
   } else {
     sslContext = sslContextBuilder.build();
   }
   ```
   
   The snippet above calls `sslContext = sslContextBuilder.build();` in every branch. Because of that, I think it's easier and cleaner to use another `SSLContextBuilder` and build the `SSLContext` in the if block.





[GitHub] [beam] andreigurau commented on a diff in pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on code in PR #24229:
URL: https://github.com/apache/beam/pull/24229#discussion_r1026836133


##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/HttpEventPublisher.java:
##########
@@ -349,14 +378,24 @@ private CloseableHttpClient getHttpClient(
                 ? NoopHostnameVerifier.INSTANCE
                 : new DefaultHostnameVerifier();
 
-        SSLContextBuilder sslContextBuilder = SSLContextBuilder.create();
+        SSLContext sslContext = SSLContextBuilder.create().build();
         if (disableCertificateValidation) {
           LOG.info("Certificate validation is disabled");
-          sslContextBuilder.loadTrustMaterial((TrustStrategy) (chain, authType) -> true);
+          sslContext =
+              SSLContextBuilder.create()

Review Comment:
   I think creating another `SSLContextBuilder` in this if block is the cleanest solution I can think of. In the `else if (rootCaCertificate != null)` block, we expect `sslContext` to already be defined so that we can call its `init` method. If we used the same `SSLContextBuilder` throughout the whole if/else-if chain instead, the code might look something like this:
   ```
   SSLContextBuilder sslContextBuilder = SSLContextBuilder.create();
   SSLContext sslContext;
   if (disableCertificateValidation) {
     sslContextBuilder.loadTrustMaterial((TrustStrategy) (chain, authType) -> true);
     sslContext = sslContextBuilder.build();
   } else if (rootCaCertificate != null) {
     ...
     sslContext = sslContextBuilder.build();
     sslContext.init(...);
   } else {
     sslContext = sslContextBuilder.build();
   }
   ```
   
   The snippet above calls `sslContext = sslContextBuilder.build();` in every branch. Because of that, I think it's easier and cleaner to use another `SSLContextBuilder` and build the `SSLContext` in the if block.
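
As a point of comparison only, the same three-way branching can be written against the plain JDK `javax.net.ssl` API so that the context is created and initialized exactly once: each branch picks the trust managers, and a single `init` call follows. This is a sketch under assumptions, not the code merged in this PR; it does not use Apache HttpClient's `SSLContextBuilder`, and `SslContextSketch`, `TrustAllManager`, and `build` are hypothetical names.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

class SslContextSketch {

  /** Accepts every chain. Insecure; mirrors the "certificate validation disabled" branch. */
  static final class TrustAllManager implements X509TrustManager {
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) {}

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) {}

    @Override
    public X509Certificate[] getAcceptedIssuers() {
      return new X509Certificate[0];
    }
  }

  /**
   * Chooses the trust managers per branch, then initializes the context once.
   *
   * @param customTrustManager stand-in for a trust manager built from the root CA; may be null
   */
  static SSLContext build(boolean disableCertificateValidation, TrustManager customTrustManager)
      throws GeneralSecurityException {
    TrustManager[] trustManagers;
    if (disableCertificateValidation) {
      trustManagers = new TrustManager[] {new TrustAllManager()};
    } else if (customTrustManager != null) {
      trustManagers = new TrustManager[] {customTrustManager};
    } else {
      trustManagers = null; // null tells init() to use the JDK default trust managers
    }
    SSLContext sslContext = SSLContext.getInstance("TLS");
    sslContext.init(null, trustManagers, null);
    return sslContext;
  }
}
```

The trade-off is that this bypasses the builder entirely, which may or may not fit the rest of `getHttpClient`.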





[GitHub] [beam] github-actions[bot] commented on pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24229:
URL: https://github.com/apache/beam/pull/24229#issuecomment-1318942986

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @kileys for label java.
   R: @chamikaramj for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).




[GitHub] [beam] chamikaramj merged pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
chamikaramj merged PR #24229:
URL: https://github.com/apache/beam/pull/24229




[GitHub] [beam] chamikaramj commented on a diff in pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on code in PR #24229:
URL: https://github.com/apache/beam/pull/24229#discussion_r1025916723


##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/CustomX509TrustManager.java:
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.splunk;
+
+import java.io.IOException;
+import java.security.KeyStore;
+import java.security.KeyStoreException;
+import java.security.NoSuchAlgorithmException;
+import java.security.cert.CertificateException;
+import java.security.cert.X509Certificate;
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.TrustManagerFactory;
+import javax.net.ssl.X509TrustManager;
+import org.checkerframework.checker.initialization.qual.UnknownInitialization;
+import org.checkerframework.checker.nullness.qual.Nullable;
+
+/** A Custom X509TrustManager that trusts a user provided CA and default CA's. */
+public class CustomX509TrustManager implements X509TrustManager {
+
+  private final @Nullable X509TrustManager defaultTrustManager;
+
+  private final @Nullable X509TrustManager userTrustManager;
+
+  public CustomX509TrustManager(X509Certificate userCertificate)
+      throws CertificateException, KeyStoreException, NoSuchAlgorithmException, IOException {
+    // Get Default Trust Manager
+    TrustManagerFactory trustMgrFactory =
+        TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
+    trustMgrFactory.init((KeyStore) null);
+    defaultTrustManager = getX509TrustManager(trustMgrFactory.getTrustManagers());
+
+    // Create Trust Manager with user provided certificate
+    KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
+    trustStore.load(null, null);
+    trustStore.setCertificateEntry("User Provided Root CA", userCertificate);
+    trustMgrFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
+    trustMgrFactory.init(trustStore);
+    userTrustManager = getX509TrustManager(trustMgrFactory.getTrustManagers());
+  }
+
+  private @Nullable X509TrustManager getX509TrustManager(
+      @UnknownInitialization CustomX509TrustManager this, TrustManager[] trustManagers) {
+    for (TrustManager tm : trustManagers) {
+      if (tm instanceof X509TrustManager) {
+        return (X509TrustManager) tm;
+      }
+    }
+    return null;
+  }
+
+  @Override
+  public void checkClientTrusted(X509Certificate[] chain, String authType)
+      throws CertificateException {
+    if (defaultTrustManager != null) {
+      defaultTrustManager.checkClientTrusted(chain, authType);

Review Comment:
   Should we fall back to using `userTrustManager` here as well?
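
One way to read this suggestion is sketched below: try the default trust manager first and, only if it rejects the chain, consult the user-provided one. The quoted diff is cut off before the rest of `checkClientTrusted`, so this is not a reconstruction of the PR's code. `FallbackTrustManager` is a hypothetical name, and the sketch assumes both delegates are non-null, unlike the `@Nullable` fields in the diff.

```java
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import java.util.Arrays;
import javax.net.ssl.X509TrustManager;

/** Hypothetical sketch: default trust manager first, user-provided one as fallback. */
class FallbackTrustManager implements X509TrustManager {

  private final X509TrustManager defaultTrustManager;
  private final X509TrustManager userTrustManager;

  FallbackTrustManager(X509TrustManager defaultTrustManager, X509TrustManager userTrustManager) {
    this.defaultTrustManager = defaultTrustManager;
    this.userTrustManager = userTrustManager;
  }

  @Override
  public void checkClientTrusted(X509Certificate[] chain, String authType)
      throws CertificateException {
    try {
      defaultTrustManager.checkClientTrusted(chain, authType);
    } catch (CertificateException e) {
      // The default CAs rejected the chain; the user CA gets the final say.
      userTrustManager.checkClientTrusted(chain, authType);
    }
  }

  @Override
  public void checkServerTrusted(X509Certificate[] chain, String authType)
      throws CertificateException {
    try {
      defaultTrustManager.checkServerTrusted(chain, authType);
    } catch (CertificateException e) {
      userTrustManager.checkServerTrusted(chain, authType);
    }
  }

  @Override
  public X509Certificate[] getAcceptedIssuers() {
    // Report the issuers of both delegates.
    X509Certificate[] defaults = defaultTrustManager.getAcceptedIssuers();
    X509Certificate[] users = userTrustManager.getAcceptedIssuers();
    X509Certificate[] all = Arrays.copyOf(defaults, defaults.length + users.length);
    System.arraycopy(users, 0, all, defaults.length, users.length);
    return all;
  }
}
```

With this shape, a chain is rejected only when both delegates throw, and the exception that propagates is the one from the user trust manager.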



##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/SplunkIO.java:
##########
@@ -264,6 +269,35 @@ public Write withDisableCertificateValidation(Boolean disableCertificateValidati
           .build();
     }
 
+    /**
+     * Same as {@link Builder#withRootCaCertificatePath(ValueProvider)} but without a {@link
+     * ValueProvider}.
+     *
+     * @param rootCaCertificatePath Path to root CA certificate
+     * @return {@link Builder}
+     */
+    public Write withRootCaCertificatePath(ValueProvider<String> rootCaCertificatePath) {
+      checkArgument(
+          rootCaCertificatePath != null,
+          "withRootCaCertificatePath(rootCaCertificatePath) called with null input.");
+      return toBuilder().setRootCaCertificatePath(rootCaCertificatePath).build();
+    }
+
+    /**
+     * Method to set the root CA certificate.
+     *
+     * @param rootCaCertificatePath Path to root CA certificate
+     * @return {@link Builder}
+     */
+    public Write withRootCaCertificatePath(String rootCaCertificatePath) {
+      checkArgument(

Review Comment:
   Ditto.



##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/HttpEventPublisher.java:
##########
@@ -349,14 +378,24 @@ private CloseableHttpClient getHttpClient(
                 ? NoopHostnameVerifier.INSTANCE
                 : new DefaultHostnameVerifier();
 
-        SSLContextBuilder sslContextBuilder = SSLContextBuilder.create();
+        SSLContext sslContext = SSLContextBuilder.create().build();
         if (disableCertificateValidation) {
           LOG.info("Certificate validation is disabled");
-          sslContextBuilder.loadTrustMaterial((TrustStrategy) (chain, authType) -> true);
+          sslContext =
+              SSLContextBuilder.create()

Review Comment:
   Should we just use the builder above instead of creating another one?



##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/SplunkEventWriter.java:
##########
@@ -396,6 +411,41 @@ private static void flushWriteFailures(
     }
   }
 
+  /**
+   * Reads a root CA certificate from GCS and returns it as raw bytes.
+   *
+   * @param filePath path to root CA cert in GCS
+   * @return raw contents of cert
+   * @throws RuntimeException thrown if not able to read or parse cert
+   */
+  public static byte[] getCertFromGcsAsBytes(String filePath) throws IOException {
+    ReadableByteChannel channel = getGcsFileByteChannel(filePath);
+    try (InputStream inputStream = Channels.newInputStream(channel)) {
+      return IOUtils.toByteArray(inputStream);
+    } catch (IOException e) {
+      throw new RuntimeException("Error when reading: " + filePath, e);
+    }
+  }
+
+  /** Handles getting the {@link ReadableByteChannel} for {@code filePath}. */
+  private static ReadableByteChannel getGcsFileByteChannel(String filePath) throws IOException {

Review Comment:
   Is `filePath` expected to be a glob (rather than an exact path)?
   
   If it's an exact file path, this matching and verification is unnecessary: we can assume it refers to a single file, call `FileSystems.open` directly, and remove this method.



##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/SplunkIO.java:
##########
@@ -264,6 +269,35 @@ public Write withDisableCertificateValidation(Boolean disableCertificateValidati
           .build();
     }
 
+    /**
+     * Same as {@link Builder#withRootCaCertificatePath(ValueProvider)} but without a {@link
+     * ValueProvider}.
+     *
+     * @param rootCaCertificatePath Path to root CA certificate
+     * @return {@link Builder}
+     */
+    public Write withRootCaCertificatePath(ValueProvider<String> rootCaCertificatePath) {
+      checkArgument(

Review Comment:
   Please use `checkNotNull` instead: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/d8c84c6f735848c715eef23c1af3486a1248f11c/v1/src/main/java/com/google/cloud/teleport/splunk/SplunkEvent.java#L78





[GitHub] [beam] andreigurau commented on a diff in pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on code in PR #24229:
URL: https://github.com/apache/beam/pull/24229#discussion_r1026825958


##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/SplunkIO.java:
##########
@@ -264,6 +269,35 @@ public Write withDisableCertificateValidation(Boolean disableCertificateValidati
           .build();
     }
 
+    /**
+     * Same as {@link Builder#withRootCaCertificatePath(ValueProvider)} but without a {@link
+     * ValueProvider}.
+     *
+     * @param rootCaCertificatePath Path to root CA certificate
+     * @return {@link Builder}
+     */
+    public Write withRootCaCertificatePath(ValueProvider<String> rootCaCertificatePath) {
+      checkArgument(

Review Comment:
   Done (along with the other parameters in this Builder)





[GitHub] [beam] chamikaramj commented on pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on PR #24229:
URL: https://github.com/apache/beam/pull/24229#issuecomment-1324398193

   LGTM. Thanks.




[GitHub] [beam] andreigurau commented on pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on PR #24229:
URL: https://github.com/apache/beam/pull/24229#issuecomment-1319087828

   RAT is failing because I'm adding test `.crt` files, which RAT doesn't seem to recognize. I'm not sure how to fix it, though.




[GitHub] [beam] chamikaramj commented on pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
chamikaramj commented on PR #24229:
URL: https://github.com/apache/beam/pull/24229#issuecomment-1319457382

   Regarding the RAT failure, how about generating the `.crt` files from the tests instead of copying them into the repo?
   




[GitHub] [beam] andreigurau commented on a diff in pull request #24229: Add rootCaCertificate option to SplunkIO

Posted by GitBox <gi...@apache.org>.
andreigurau commented on code in PR #24229:
URL: https://github.com/apache/beam/pull/24229#discussion_r1026825506


##########
sdks/java/io/splunk/src/main/java/org/apache/beam/sdk/io/splunk/SplunkEventWriter.java:
##########
@@ -396,6 +411,41 @@ private static void flushWriteFailures(
     }
   }
 
+  /**
+   * Reads a root CA certificate from GCS and returns it as raw bytes.
+   *
+   * @param filePath path to root CA cert in GCS
+   * @return raw contents of cert
+   * @throws RuntimeException thrown if not able to read or parse cert
+   */
+  public static byte[] getCertFromGcsAsBytes(String filePath) throws IOException {
+    ReadableByteChannel channel = getGcsFileByteChannel(filePath);
+    try (InputStream inputStream = Channels.newInputStream(channel)) {
+      return IOUtils.toByteArray(inputStream);
+    } catch (IOException e) {
+      throw new RuntimeException("Error when reading: " + filePath, e);
+    }
+  }
+
+  /** Handles getting the {@link ReadableByteChannel} for {@code filePath}. */
+  private static ReadableByteChannel getGcsFileByteChannel(String filePath) throws IOException {

Review Comment:
   Yeah, the cert path is expected to point to a single file. I decided to use `FileSystems.matchSingleFileSpec(filePath)` instead, since it also checks that the path resolves to exactly one file.
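
Once a `ReadableByteChannel` is in hand, the read itself is short. The sketch below uses only the JDK: `Files.newByteChannel` on a local file stands in for the channel that Beam's `FileSystems` would return for a GCS path, and a manual copy loop stands in for `IOUtils.toByteArray`. `CertBytes`, `readAll`, and `readLocalFile` are hypothetical names, not code from this PR.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class CertBytes {

  /** Drains {@code channel} into a byte array and closes it. */
  static byte[] readAll(ReadableByteChannel channel) throws IOException {
    // Closing the stream also closes the underlying channel.
    try (InputStream in = Channels.newInputStream(channel)) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[8192];
      int n;
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
      }
      return out.toByteArray();
    }
  }

  /** Local-file stand-in for opening the certificate through Beam's FileSystems. */
  static byte[] readLocalFile(Path path) throws IOException {
    return readAll(Files.newByteChannel(path));
  }
}
```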


