Posted to reviews@spark.apache.org by markgrover <gi...@git.apache.org> on 2017/02/24 01:54:11 UTC

[GitHub] spark pull request #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive info...

GitHub user markgrover opened a pull request:

    https://github.com/apache/spark/pull/17047

    [SPARK-19720][SPARK SUBMIT] Redact sensitive information from SparkSubmit console

    ## What changes were proposed in this pull request?
    This change redacts sensitive information (based on the `spark.redaction.regex` property)
    from the Spark Submit console logs. Such sensitive information is already being
    redacted from event logs and yarn logs, etc.
    
    ## How was this patch tested?
    Testing was done manually to make sure that the console logs were not printing any
    sensitive information.
    
    Here's some output from the console:
    
    ```
    Spark properties used, including those specified through
     --conf and those from the properties file /etc/spark2/conf/spark-defaults.conf:
      (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
      (spark.authenticate,false)
      (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
    ```
    
    ```
    System properties:
    (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
    (spark.authenticate,false)
    (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
    ```
    There is a risk that if new print statements are added to the console down the road, sensitive information may still get leaked, since no test asserts on the console log output. I considered it out of scope for this JIRA to write an integration test that guards against such future leaks.
    
    Running unit tests to make sure nothing else is broken by this change.
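
The missing console-output assertion mentioned in the description could look roughly like the sketch below. It is not part of the patch: `printProperties` is a hypothetical stand-in for whatever SparkSubmit prints, and the regex and replacement text are assumptions chosen to resemble the sample output above.

```scala
import java.io.ByteArrayOutputStream
import scala.util.matching.Regex

object ConsoleRedactionCheck {
  // Assumed pattern for illustration; the configured spark.redaction.regex may differ.
  val pattern: Regex = "(?i)secret|password".r
  val replacement = "*********(redacted)"

  // Hypothetical stand-in for the code that prints properties to the console.
  def printProperties(props: Seq[(String, String)]): Unit =
    props.foreach { case (k, v) =>
      val shown = if (pattern.findFirstIn(k).isDefined) replacement else v
      println(s"($k,$shown)")
    }

  // Runs `body` and returns everything it wrote through Console/println.
  def captureOutput(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    Console.withOut(buffer)(body)
    buffer.toString("UTF-8")
  }
}
```

A test would then call `captureOutput` around the printing code and assert that no known secret value appears in the returned string.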

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/markgrover/spark master_redaction

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17047.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17047
    
----
commit 000efb1e3152f837e01ce1f80ae108c596f9baa5
Author: Mark Grover <ma...@apache.org>
Date:   2017-02-24T01:30:05Z

    [SPARK-19720][SPARK SUBMIT] Redact sensitive information from SparkSubmit console output
    
    This change redacts sensitive information (based on the spark.redaction.regex property)
    from the Spark Submit console logs. Such sensitive information is already being
    redacted from event logs and yarn logs, etc.
    
    Testing was done manually to make sure that the console logs were not printing any
    sensitive information.
    Here's some output from the console:
    Spark properties used, including those specified through
     --conf and those from the properties file /etc/spark2/conf/spark-defaults.conf:
      (spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))
      (spark.authenticate,false)
      (spark.executorEnv.HADOOP_CREDSTORE_PASSWORD,*********(redacted))

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73625/
    Test FAILed.


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73626/testReport)** for PR 17047 at commit [`d6a04b9`](https://github.com/apache/spark/commit/d6a04b9c443b84c3bb33cfcc6b4a21423bef88fc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73651/testReport)** for PR 17047 at commit [`d6a04b9`](https://github.com/apache/spark/commit/d6a04b9c443b84c3bb33cfcc6b4a21423bef88fc).


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73385/testReport)** for PR 17047 at commit [`000efb1`](https://github.com/apache/spark/commit/000efb1e3152f837e01ce1f80ae108c596f9baa5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merged build finished. Test PASSed.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73651/
    Test PASSed.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73625/testReport)** for PR 17047 at commit [`4edaf67`](https://github.com/apache/spark/commit/4edaf673b0c0b22dad9961a46f4aedabcadcd451).


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73385/
    Test PASSed.


[GitHub] spark pull request #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive info...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103573145
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -240,14 +240,16 @@ package object config {
         .longConf
         .createWithDefault(4 * 1024 * 1024)
     
    +  private[spark] val SECRET_REDACTION_PROPERTY = "spark.redaction.regex"
    --- End diff --
    
    Actually, instead of this you could use `SECRET_REDACTION_PATTERN.key` and `SECRET_REDACTION_PATTERN.defaultValue` in `Utils.scala`.
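
The suggestion above amounts to reading the property name and its default from the existing config entry rather than introducing a second string constant. A minimal sketch, assuming a simplified `ConfigEntry` stand-in (hypothetical, not Spark's real class) and an assumed default pattern:

```scala
object ConfigEntrySketch {
  // Hypothetical, simplified stand-in for Spark's internal config entry type.
  final case class ConfigEntry(key: String, defaultValue: Option[String])

  // The default pattern here is an assumption for illustration only.
  val SECRET_REDACTION_PATTERN: ConfigEntry =
    ConfigEntry("spark.redaction.regex", Some("(?i)secret|password"))

  // Looks up the regex in raw key/value pairs, falling back to the entry's default.
  // The property name lives in one place: the config entry itself.
  def redactionRegex(kvs: Map[String, String]): String =
    kvs.getOrElse(SECRET_REDACTION_PATTERN.key, SECRET_REDACTION_PATTERN.defaultValue.get)
}
```

The `.get` on `defaultValue` is safe only because this stand-in entry always carries a default; an entry without one would need different handling.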


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73554/testReport)** for PR 17047 at commit [`7753998`](https://github.com/apache/spark/commit/7753998f0a21073a05897b8945c8e61a1fe4fc84).


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73626/testReport)** for PR 17047 at commit [`d6a04b9`](https://github.com/apache/spark/commit/d6a04b9c443b84c3bb33cfcc6b4a21423bef88fc).


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73554/testReport)** for PR 17047 at commit [`7753998`](https://github.com/apache/spark/commit/7753998f0a21073a05897b8945c8e61a1fe4fc84).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Seems unrelated.


[GitHub] spark pull request #17047: [SPARK-19720][CORE] Redact sensitive information ...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103585591
  
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -240,14 +240,16 @@ package object config {
         .longConf
         .createWithDefault(4 * 1024 * 1024)
     
    +  private[spark] val SECRET_REDACTION_PROPERTY = "spark.redaction.regex"
    --- End diff --
    
    Thanks @vanzin Updated.


[GitHub] spark pull request #17047: [SPARK-19720][CORE] Redact sensitive information ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17047


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    retest this please


[GitHub] spark pull request #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive info...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103521931
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2574,13 +2575,30 @@ private[spark] object Utils extends Logging {
     
       def redact(conf: SparkConf, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         val redactionPattern = conf.get(SECRET_REDACTION_PATTERN).r
    +    redact(redactionPattern, kvs)
    +  }
    +
    +  private def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         kvs.map { kv =>
           redactionPattern.findFirstIn(kv._1)
             .map { ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
    --- End diff --
    
    nit: s/`ignore`/`_`/
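
For readers unfamiliar with the nit: the lambda parameter holds the matched text, which is never used, so Scala's `_` placeholder states that intent directly. A self-contained sketch of the helper with the nit applied, where the replacement text is an assumption taken from the sample console output earlier in the thread:

```scala
import scala.util.matching.Regex

object RedactSketch {
  // Assumed value, copied from the sample console output in the PR description.
  val REDACTION_REPLACEMENT_TEXT = "*********(redacted)"

  def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { kv =>
      redactionPattern.findFirstIn(kv._1)
        // The matched substring is irrelevant; only the fact of a match matters.
        .map { _ => (kv._1, REDACTION_REPLACEMENT_TEXT) }
        .getOrElse(kv)
    }
}
```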


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73625/testReport)** for PR 17047 at commit [`4edaf67`](https://github.com/apache/spark/commit/4edaf673b0c0b22dad9961a46f4aedabcadcd451).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request #17047: [SPARK-19720][CORE] Redact sensitive information ...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103587323
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2574,13 +2575,31 @@ private[spark] object Utils extends Logging {
     
       def redact(conf: SparkConf, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         val redactionPattern = conf.get(SECRET_REDACTION_PATTERN).r
    +    redact(redactionPattern, kvs)
    +  }
    +
    +  private def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         kvs.map { kv =>
           redactionPattern.findFirstIn(kv._1)
    -        .map { ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
    +        .map {ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
    --- End diff --
    
    Ah, right, I misunderstood that. I took your comment as a bash + scala way of saying to eliminate the space (partly because it's hard to make out spacing in GitHub comments). My bad, let me fix that.


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73626/
    Test PASSed.


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73385/testReport)** for PR 17047 at commit [`000efb1`](https://github.com/apache/spark/commit/000efb1e3152f837e01ce1f80ae108c596f9baa5).


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request #17047: [SPARK-19720][CORE] Redact sensitive information ...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103586865
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2574,13 +2575,31 @@ private[spark] object Utils extends Logging {
     
       def redact(conf: SparkConf, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         val redactionPattern = conf.get(SECRET_REDACTION_PATTERN).r
    +    redact(redactionPattern, kvs)
    +  }
    +
    +  private def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         kvs.map { kv =>
           redactionPattern.findFirstIn(kv._1)
    -        .map { ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
    +        .map {ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
    --- End diff --
    
    nit: replace "ignore" with "_" (in case my previous comment wasn't clear). also missing a space after '{'.


[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73554/
    Test PASSed.


[GitHub] spark pull request #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive info...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103069680
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2574,13 +2575,30 @@ private[spark] object Utils extends Logging {
     
       def redact(conf: SparkConf, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         val redactionPattern = conf.get(SECRET_REDACTION_PATTERN).r
    +    redact(redactionPattern, kvs)
    +  }
    +
    +  private def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         kvs.map { kv =>
           redactionPattern.findFirstIn(kv._1)
             .map { ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
             .getOrElse(kv)
         }
       }
     
    +  /**
    +   * Looks up the redaction regex from within the key value pairs and uses it to redact the rest
    +   * of the key value pairs. No care is taken to make sure the redaction property itself is not
    +   * redacted. So theoretically, the property itself could be configured to redact its own value
    +   * when printing.
    +   * @param kvs
    +   * @return
    +   */
    +  def redact(kvs: Map[String, String]): Seq[(String, String)] = {
    --- End diff --
    
    (Nit: I'd omit param and return if they're not filled in.)
    So this is used in cases where there isn't a conf object available yet, but the argument itself has the redaction config? I was slightly worried about the parallel implementation but that would be a reasonable reason to do it. 
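
The case discussed here, where no conf object exists yet and the regex has to come from the key/value pairs themselves, can be sketched as follows. The names and the default pattern are assumptions for illustration, not the patch's actual code. The sketch also shows the caveat from the doc comment: a pattern that matches the property's own name redacts the pattern itself.

```scala
import scala.util.matching.Regex

object MapRedactSketch {
  val RedactionKey = "spark.redaction.regex"
  val DefaultPattern = "(?i)secret|password" // assumed default, for illustration
  val Replacement = "*********(redacted)"    // assumed, taken from the sample output

  // Looks up the regex inside the map being redacted, then applies it to every key.
  // No special case protects the redaction property itself.
  def redact(kvs: Map[String, String]): Seq[(String, String)] = {
    val pattern: Regex = kvs.getOrElse(RedactionKey, DefaultPattern).r
    kvs.toSeq.map { case (k, v) =>
      if (pattern.findFirstIn(k).isDefined) (k, Replacement) else (k, v)
    }
  }
}
```

With `spark.redaction.regex` set to `(?i)regex|password`, for example, the `spark.redaction.regex` entry is redacted along with any password keys.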


[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Jenkins, test this again, please.




[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    **[Test build #73651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73651/testReport)** for PR 17047 at commit [`d6a04b9`](https://github.com/apache/spark/commit/d6a04b9c443b84c3bb33cfcc6b4a21423bef88fc).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive info...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17047#discussion_r103367049
  
    --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
    @@ -2574,13 +2575,30 @@ private[spark] object Utils extends Logging {
     
       def redact(conf: SparkConf, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         val redactionPattern = conf.get(SECRET_REDACTION_PATTERN).r
    +    redact(redactionPattern, kvs)
    +  }
    +
    +  private def redact(redactionPattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] = {
         kvs.map { kv =>
           redactionPattern.findFirstIn(kv._1)
             .map { ignore => (kv._1, REDACTION_REPLACEMENT_TEXT) }
             .getOrElse(kv)
         }
       }
     
    +  /**
    +   * Looks up the redaction regex from within the key value pairs and uses it to redact the rest
    +   * of the key value pairs. No care is taken to make sure the redaction property itself is not
    +   * redacted. So theoretically, the property itself could be configured to redact its own value
    +   * when printing.
    +   * @param kvs
    +   * @return
    +   */
    +  def redact(kvs: Map[String, String]): Seq[(String, String)] = {
    --- End diff --
    
    Correct, that's exactly the use case - where there isn't a conf object available yet. I will update the Javadoc. Thanks for reviewing!
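
The use case confirmed above (redacting key/value pairs before a conf object exists) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code merged in the PR: the object name `RedactionSketch`, the constant names, and the fallback pattern `(?i)secret|password` are assumptions made for the example. Only the property name `spark.redaction.regex` and the replacement text come from the PR description.

```scala
import scala.util.matching.Regex

// Hypothetical stand-alone sketch; names below are illustrative, not Spark's.
object RedactionSketch {
  // Property name as given in the PR description.
  val RedactionKey = "spark.redaction.regex"
  // ASSUMPTION: fallback pattern used when the map does not set the property.
  val AssumedDefaultPattern = "(?i)secret|password"
  // Replacement text as it appears in the console output quoted in the PR.
  val ReplacementText = "*********(redacted)"

  // Shared helper: replace the value of every pair whose key matches the pattern.
  private def redact(pattern: Regex, kvs: Seq[(String, String)]): Seq[(String, String)] =
    kvs.map { case (key, value) =>
      if (pattern.findFirstIn(key).isDefined) (key, ReplacementText) else (key, value)
    }

  // Variant for callers that hold only raw properties: the pattern is looked up
  // in the same map that is about to be redacted.
  def redact(kvs: Map[String, String]): Seq[(String, String)] = {
    val pattern = kvs.getOrElse(RedactionKey, AssumedDefaultPattern).r
    redact(pattern, kvs.toSeq)
  }
}
```

With this sketch, `RedactionSketch.redact(Map("spark.executorEnv.HADOOP_CREDSTORE_PASSWORD" -> "x"))` replaces the value with the redaction text, while keys that do not match the pattern pass through unchanged. As the Scaladoc in the diff notes, nothing stops the pattern from matching the redaction property's own key.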




[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by markgrover <gi...@git.apache.org>.
Github user markgrover commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Thanks @vanzin 




[GitHub] spark issue #17047: [SPARK-19720][CORE] Redact sensitive information from Sp...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merging to master.




[GitHub] spark issue #17047: [SPARK-19720][SPARK SUBMIT] Redact sensitive information...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17047
  
    Merged build finished. Test PASSed.

