Posted to reviews@spark.apache.org by witgo <gi...@git.apache.org> on 2014/04/05 16:23:58 UTC
[GitHub] spark pull request: remove scalalogging-slf4j dependency
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/332
remove scalalogging-slf4j dependency
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark remove_scalalogging
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/332.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #332
----
commit eb93ee2e88cfe030da1eb2af01c5fb04ddb6d647
Author: witgo <wi...@qq.com>
Date: 2014-04-05T14:21:18Z
remove scalalogging-slf4j dependency
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] spark pull request: [SPARK-1470][SPARK-1842] Use the scala-logging...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-50962790
QA results for PR 332:
- This patch PASSES unit tests.
- This patch merges cleanly.
- This patch adds no public classes.

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17767/consoleFull
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11330987
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -30,65 +30,66 @@ trait Logging {
// Make the log field transient so that objects with Logging can
// be serialized and used on another machine
@transient private var log_ : Logger = null
+ @transient protected var logName_ : String = null
--- End diff --
Yes, I agree with your idea. A method is better here.
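For illustration, the "method instead of a var" idea agreed on above could be sketched as follows. This is a toy, not the actual Spark patch; the names `NamedLogging` and `logName` are hypothetical:

```scala
// Hedged sketch of the idea discussed above: expose the logger name as an
// overridable method rather than a mutable protected var. Not the actual
// Spark code; NamedLogging and logName are illustrative names.
trait NamedLogging {
  // Default: the class name, with the trailing "$" of Scala objects stripped
  protected def logName: String = this.getClass.getName.stripSuffix("$")
}

object LogNameDemo extends NamedLogging {
  def main(args: Array[String]): Unit = {
    // For a Scala object, the synthetic "$" suffix is dropped
    assert(logName == "LogNameDemo")
    println(logName)
  }
}
```

A subclass can then simply `override def logName` to customize the logger name, with no mutable state to initialize in the right order.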
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39694787
Jenkins, test this please.
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39695940
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13835/
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39656418
My thought is - if it's only a matter of saving us from having to write 10 `if` statements (which I don't think have been modified in several years), it might not be worth having an extra dependency to track.
I'm guessing these if statements get optimized at runtime to the point where they are very cheap.
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40269717
Jenkins, test this please.
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40221509
How do I get Jenkins to run the tests?
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39675896
It's not a matter of the if statements inside of Spark's logging code. Scala logging rewrites your log statements to put the if checks _inline_ with _every_ logging statement.
You write: `logger.debug(s"Log message $someExpensiveComputation")`
You get:
```
if (logger.isDebugEnabled) {
  val logMsg = "Log message " + someExpensiveComputation()
  logger.debug(logMsg)
}
```
Spark's code does something like this:
```
val logMessage = new Function0[String] { def apply() = "Log message " + someExpensiveComputation() }
log.debug(logMessage)
```
The macros allow you to avoid both the function call and the possible GC overhead of allocating the closure in cases where the log does not fire.
While not huge, this does make a performance difference in practice:
```
std logging:   19885.48 ms
spark logging:   914.408 ms
scala logging:   729.779 ms
```
(And this was only a micro benchmark with no GC pressure).
Given that this library is a thin layer, endorsed by Typesafe, that integrates nicely with our current logging framework (and is easily pluggable if we ever decide to change) and improves performance, I really think this PR is a step backwards. If we want consistency, we should probably get rid of Spark's home-grown logging code instead.
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11331115
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -30,65 +30,66 @@ trait Logging {
// Make the log field transient so that objects with Logging can
// be serialized and used on another machine
@transient private var log_ : Logger = null
+ @transient protected var logName_ : String = null
// Method to get or create the logger for this object
protected def log: Logger = {
if (log_ == null) {
initializeIfNecessary()
- var className = this.getClass.getName
+ var className = if(logName_ == null) this.getClass.getName else logName_
// Ignore trailing $'s in the class names for Scala objects
if (className.endsWith("$")) {
className = className.substring(0, className.length - 1)
}
- log_ = LoggerFactory.getLogger(className)
+ log_ = Logger(LoggerFactory.getLogger(className))
}
log_
}
// Log methods that take only a String
protected def logInfo(msg: => String) {
- if (log.isInfoEnabled) log.info(msg)
+ log.info(msg)
}
protected def logDebug(msg: => String) {
- if (log.isDebugEnabled) log.debug(msg)
+ log.debug(msg)
--- End diff --
Sorry, this is my bad, I failed to see that the scala logging wrapper we're using does this for us. Apologies!
---
[GitHub] spark pull request: [SPARK-1470] remove scalalogging-slf4j depende...
Posted by witgo <gi...@git.apache.org>.
GitHub user witgo reopened a pull request:
https://github.com/apache/spark/pull/332
[SPARK-1470] remove scalalogging-slf4j dependency
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark remove_scalalogging
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/332.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #332
----
commit dd95abada78b4d0aec97dacda50fdfd74464b073
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-15T08:46:57Z
[SPARK-2399] Add support for LZ4 compression.
Based on Greg Bowyer's patch from JIRA https://issues.apache.org/jira/browse/SPARK-2399
Author: Reynold Xin <rx...@apache.org>
Closes #1416 from rxin/lz4 and squashes the following commits:
6c8fefe [Reynold Xin] Fixed typo.
8a14d38 [Reynold Xin] [SPARK-2399] Add support for LZ4 compression.
commit 52beb20f7904e0333198b9b14619366ddf53ab85
Author: DB Tsai <db...@alpinenow.com>
Date: 2014-07-15T09:14:58Z
[SPARK-2477][MLlib] Using appendBias for adding intercept in GeneralizedLinearAlgorithm
Instead of using prependOne currently in GeneralizedLinearAlgorithm, we would like to use appendBias for 1) keeping the indices of original training set unchanged by adding the intercept into the last element of vector and 2) using the same public API for consistently adding intercept.
Author: DB Tsai <db...@alpinenow.com>
Closes #1410 from dbtsai/SPARK-2477_intercept_with_appendBias and squashes the following commits:
011432c [DB Tsai] From Alpine Data Labs
commit 8f1d4226c285e33d2fb839d3163bb374eb6db0e7
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-15T09:15:29Z
Update README.md to include a slightly more informative project description.
(cherry picked from commit 401083be9f010f95110a819a49837ecae7d9c4ec)
Signed-off-by: Reynold Xin <rx...@apache.org>
commit 6555618c8f39b4e7da9402c3fd9da7a75bf7794e
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-15T09:20:01Z
README update: added "for Big Data".
commit 04b01bb101eeaf76c2e7c94c291669f0b2372c9a
Author: Alexander Ulanov <na...@yandex.ru>
Date: 2014-07-15T15:40:22Z
[MLLIB] [SPARK-2222] Add multiclass evaluation metrics
Adding two classes:
1) MulticlassMetrics implements various multiclass evaluation metrics
2) MulticlassMetricsSuite implements unit tests for MulticlassMetrics
Author: Alexander Ulanov <na...@yandex.ru>
Author: unknown <ul...@ULANOV1.emea.hpqcorp.net>
Author: Xiangrui Meng <me...@databricks.com>
Closes #1155 from avulanov/master and squashes the following commits:
2eae80f [Alexander Ulanov] Merge pull request #1 from mengxr/avulanov-master
5ebeb08 [Xiangrui Meng] minor updates
79c3555 [Alexander Ulanov] Addressing reviewers comments mengxr
0fa9511 [Alexander Ulanov] Addressing reviewers comments mengxr
f0dadc9 [Alexander Ulanov] Addressing reviewers comments mengxr
4811378 [Alexander Ulanov] Removing println
87fb11f [Alexander Ulanov] Addressing reviewers comments mengxr. Added confusion matrix
e3db569 [Alexander Ulanov] Addressing reviewers comments mengxr. Added true positive rate and false positive rate. Test suite code style.
a7e8bf0 [Alexander Ulanov] Addressing reviewers comments mengxr
c3a77ad [Alexander Ulanov] Addressing reviewers comments mengxr
e2c91c3 [Alexander Ulanov] Fixes to mutliclass metics
d5ce981 [unknown] Comments about Double
a5c8ba4 [unknown] Unit tests. Class rename
fcee82d [unknown] Unit tests. Class rename
d535d62 [unknown] Multiclass evaluation
commit cb09e93c1d7ef9c8f0a1abe4e659783c74993a4e
Author: William Benton <wi...@redhat.com>
Date: 2014-07-15T16:13:39Z
Reformat multi-line closure argument.
Author: William Benton <wi...@redhat.com>
Closes #1419 from willb/reformat-2486 and squashes the following commits:
2676231 [William Benton] Reformat multi-line closure argument.
commit 9dd635eb5df52835b3b7f4f2b9c789da9e813c71
Author: witgo <wi...@qq.com>
Date: 2014-07-15T17:46:17Z
SPARK-2480: Resolve sbt warnings "NOTE: SPARK_YARN is deprecated, please use -Pyarn flag"
Author: witgo <wi...@qq.com>
Closes #1404 from witgo/run-tests and squashes the following commits:
f703aee [witgo] fix Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
2944f51 [witgo] Remove "NOTE: SPARK_YARN is deprecated, please use -Pyarn flag"
ef59c70 [witgo] fix Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
6cefee5 [witgo] Remove "NOTE: SPARK_YARN is deprecated, please use -Pyarn flag"
commit 72ea56da8e383c61c6f18eeefef03b9af00f5158
Author: witgo <wi...@qq.com>
Date: 2014-07-15T18:52:56Z
SPARK-1291: Link the spark UI to RM ui in yarn-client mode
Author: witgo <wi...@qq.com>
Closes #1112 from witgo/SPARK-1291 and squashes the following commits:
6022bcd [witgo] review commit
1fbb925 [witgo] add addAmIpFilter to yarn alpha
210299c [witgo] review commit
1b92a07 [witgo] review commit
6896586 [witgo] Add comments to addWebUIFilter
3e9630b [witgo] review commit
142ee29 [witgo] review commit
1fe7710 [witgo] Link the spark UI to RM ui in yarn-client mode
commit e7ec815d9a2b0f89a56dc7dd3106c31a09492028
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-15T20:13:33Z
Added LZ4 to compression codec in configuration page.
Author: Reynold Xin <rx...@apache.org>
Closes #1417 from rxin/lz4 and squashes the following commits:
472f6a1 [Reynold Xin] Set the proper default.
9cf0b2f [Reynold Xin] Added LZ4 to compression codec in configuration page.
commit a21f9a7543309320bb2791468243c8f10bc6e81b
Author: Xiangrui Meng <me...@databricks.com>
Date: 2014-07-15T21:00:54Z
[SPARK-2471] remove runtime scope for jets3t
The assembly jar (built by sbt) doesn't include jets3t if we set it to runtime only, but I don't know whether it was set this way for a particular reason.
CC: srowen ScrapCodes
Author: Xiangrui Meng <me...@databricks.com>
Closes #1402 from mengxr/jets3t and squashes the following commits:
bfa2d17 [Xiangrui Meng] remove runtime scope for jets3t
commit 0f98ef1a2c9ecf328f6c5918808fa5ca486e8afd
Author: Michael Armbrust <mi...@databricks.com>
Date: 2014-07-15T21:01:48Z
[SPARK-2483][SQL] Fix parsing of repeated, nested data access.
Author: Michael Armbrust <mi...@databricks.com>
Closes #1411 from marmbrus/nestedRepeated and squashes the following commits:
044fa09 [Michael Armbrust] Fix parsing of repeated, nested data access.
commit bcd0c30c7eea4c50301cb732c733fdf4d4142060
Author: Michael Armbrust <mi...@databricks.com>
Date: 2014-07-15T21:04:01Z
[SQL] Whitelist more Hive tests.
Author: Michael Armbrust <mi...@databricks.com>
Closes #1396 from marmbrus/moreTests and squashes the following commits:
6660b60 [Michael Armbrust] Blacklist a test that requires DFS command.
8b6001c [Michael Armbrust] Add golden files.
ccd8f97 [Michael Armbrust] Whitelist more tests.
commit 8af46d58464b96471825ce376c3e11c8b1108c0e
Author: Yin Huai <hu...@cse.ohio-state.edu>
Date: 2014-07-15T21:06:45Z
[SPARK-2474][SQL] For a registered table in OverrideCatalog, the Analyzer failed to resolve references in the format of "tableName.fieldName"
Please refer to JIRA (https://issues.apache.org/jira/browse/SPARK-2474) for how to reproduce the problem and my understanding of the root cause.
Author: Yin Huai <hu...@cse.ohio-state.edu>
Closes #1406 from yhuai/SPARK-2474 and squashes the following commits:
96b1627 [Yin Huai] Merge remote-tracking branch 'upstream/master' into SPARK-2474
af36d65 [Yin Huai] Fix comment.
be86ba9 [Yin Huai] Correct SQL console settings.
c43ad00 [Yin Huai] Wrap the relation in a Subquery named by the table name in OverrideCatalog.lookupRelation.
a5c2145 [Yin Huai] Support sql/console.
commit 61de65bc69f9a5fc396b76713193c6415436d452
Author: William Benton <wi...@redhat.com>
Date: 2014-07-15T21:11:57Z
SPARK-2407: Added internal implementation of SQL SUBSTR()
This replaces the Hive UDF for SUBSTR(ING) with an implementation in Catalyst
and adds tests to verify correct operation.
Author: William Benton <wi...@redhat.com>
Closes #1359 from willb/internalSqlSubstring and squashes the following commits:
ccedc47 [William Benton] Fixed too-long line.
a30a037 [William Benton] replace view bounds with implicit parameters
ec35c80 [William Benton] Adds fixes from review:
4f3bfdb [William Benton] Added internal implementation of SQL SUBSTR()
commit 502f90782ad474e2630ed5be4d3c4be7dab09c34
Author: Michael Armbrust <mi...@databricks.com>
Date: 2014-07-16T00:56:17Z
[SQL] Attribute equality comparisons should be done by exprId.
Author: Michael Armbrust <mi...@databricks.com>
Closes #1414 from marmbrus/exprIdResolution and squashes the following commits:
97b47bc [Michael Armbrust] Attribute equality comparisons should be done by exprId.
commit c2048a5165b270f5baf2003fdfef7bc6c5875715
Author: Zongheng Yang <zo...@gmail.com>
Date: 2014-07-16T00:58:28Z
[SPARK-2498] [SQL] Synchronize on a lock when using scala reflection inside data type objects.
JIRA ticket: https://issues.apache.org/jira/browse/SPARK-2498
Author: Zongheng Yang <zo...@gmail.com>
Closes #1423 from concretevitamin/scala-ref-catalyst and squashes the following commits:
325a149 [Zongheng Yang] Synchronize on a lock when initializing data type objects in Catalyst.
commit 4576d80a5155c9fbfebe9c36cca06c208bca5bd3
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-16T01:47:39Z
[SPARK-2469] Use Snappy (instead of LZF) for default shuffle compression codec
This reduces shuffle compression memory usage by 3x.
Author: Reynold Xin <rx...@apache.org>
Closes #1415 from rxin/snappy and squashes the following commits:
06c1a01 [Reynold Xin] SPARK-2469: Use Snappy (instead of LZF) for default shuffle compression codec.
commit 9c12de5092312319aa22f24df47a6de0e41a0102
Author: Henry Saputra <he...@gmail.com>
Date: 2014-07-16T04:21:52Z
[SPARK-2500] Move the logInfo for registering BlockManager to BlockManagerMasterActor.register method
PR for SPARK-2500
Move the logInfo call for BlockManager to BlockManagerMasterActor.register instead of BlockManagerInfo constructor.
Previously, the logInfo call for registering a BlockManager happened in the BlockManagerInfo constructor. This is confusing because the code could call "new BlockManagerInfo" without actually registering a BlockManager, which is misleading when reading the log files.
Author: Henry Saputra <he...@gmail.com>
Closes #1424 from hsaputra/move_registerblockmanager_log_to_registration_method and squashes the following commits:
3370b4a [Henry Saputra] Move the loginfo for BlockManager to BlockManagerMasterActor.register instead of BlockManagerInfo constructor.
commit 563acf5edfbfb2fa756a1f0accde0940592663e9
Author: Ken Takagiwa <ke...@kens-macbook-pro.local>
Date: 2014-07-16T04:34:05Z
follow pep8 None should be compared using is or is not
http://legacy.python.org/dev/peps/pep-0008/
## Programming Recommendations
- Comparisons to singletons like None should always be done with is or is not, never the equality operators.
Author: Ken Takagiwa <ke...@Kens-MacBook-Pro.local>
Closes #1422 from giwa/apache_master and squashes the following commits:
7b361f3 [Ken Takagiwa] follow pep8 None should be checked using is or is not
commit 90ca532a0fd95dc85cff8c5722d371e8368b2687
Author: Aaron Staple <aa...@gmail.com>
Date: 2014-07-16T04:35:36Z
[SPARK-2314][SQL] Override collect and take in JavaSchemaRDD, forwarding to SchemaRDD implementations.
Author: Aaron Staple <aa...@gmail.com>
Closes #1421 from staple/SPARK-2314 and squashes the following commits:
73e04dc [Aaron Staple] [SPARK-2314] Override collect and take in JavaSchemaRDD, forwarding to SchemaRDD implementations.
commit 9b38b7c71352bb5e6d359515111ad9ca33299127
Author: Takuya UESHIN <ue...@happy-camper.st>
Date: 2014-07-16T05:35:34Z
[SPARK-2509][SQL] Add optimization for Substring.
`Substring` including `null` literal cases could be added to `NullPropagation`.
Author: Takuya UESHIN <ue...@happy-camper.st>
Closes #1428 from ueshin/issues/SPARK-2509 and squashes the following commits:
d9eb85f [Takuya UESHIN] Add Substring cases to NullPropagation.
commit 632fb3d9a9ebb3d2218385403145d5b89c41c025
Author: Takuya UESHIN <ue...@happy-camper.st>
Date: 2014-07-16T05:43:48Z
[SPARK-2504][SQL] Fix nullability of Substring expression.
This is a follow-up of #1359 with nullability narrowing.
Author: Takuya UESHIN <ue...@happy-camper.st>
Closes #1426 from ueshin/issues/SPARK-2504 and squashes the following commits:
5157832 [Takuya UESHIN] Remove unnecessary white spaces.
80958ac [Takuya UESHIN] Fix nullability of Substring expression.
commit efc452a16322e8b20b3c4fe1d6847315f928cd2d
Author: Cheng Lian <li...@gmail.com>
Date: 2014-07-16T16:44:51Z
[SPARK-2119][SQL] Improved Parquet performance when reading off S3
JIRA issue: [SPARK-2119](https://issues.apache.org/jira/browse/SPARK-2119)
Essentially this PR fixed three issues to gain much better performance when reading large Parquet file off S3.
1. When reading the schema, fetching Parquet metadata from a part-file rather than the `_metadata` file
The `_metadata` file contains metadata of all row groups, and can be very large if there are many row groups. Since schema information and row group metadata are coupled within a single Thrift object, we have to read the whole `_metadata` to fetch the schema. On the other hand, schema is replicated among footers of all part-files, which are fairly small.
1. Only add the root directory of the Parquet file rather than all the part-files to input paths
The HDFS API can automatically filter out hidden files and underscore files (`_SUCCESS` & `_metadata`), so there's no need to filter out the part-files and add them individually to input paths. What makes it much worse is that `FileInputFormat.listStatus()` calls `FileSystem.globStatus()` on each individual input path sequentially, each resulting in a blocking remote S3 HTTP request.
1. Worked around [PARQUET-16](https://issues.apache.org/jira/browse/PARQUET-16)
Essentially PARQUET-16 is similar to the above issue, and results in lots of sequential `FileSystem.getFileStatus()` calls, which are further translated into a bunch of remote S3 HTTP requests.
`FilteringParquetRowInputFormat` should be cleaned up once PARQUET-16 is fixed.
Below is the micro benchmark result. The dataset used is an S3 Parquet file consisting of 3,793 partitions, about 110 MB per partition on average. The benchmark is done with a 9-node AWS cluster.
- Creating a Parquet `SchemaRDD` (Parquet schema is fetched)
```scala
val tweets = parquetFile(uri)
```
- Before: 17.80s
- After: 8.61s
- Fetching partition information
```scala
tweets.getPartitions
```
- Before: 700.87s
- After: 21.47s
- Counting the whole file (both steps above are executed altogether)
```scala
parquetFile(uri).count()
```
- Before: ??? (haven't tested yet)
- After: 53.26s
Author: Cheng Lian <li...@gmail.com>
Closes #1370 from liancheng/faster-parquet and squashes the following commits:
94a2821 [Cheng Lian] Added comments about schema consistency
d2c4417 [Cheng Lian] Worked around PARQUET-16 to improve Parquet performance
1c0d1b9 [Cheng Lian] Accelerated Parquet schema retrieving
5bd3d29 [Cheng Lian] Fixed Parquet log level
commit 33e64ecacbc44567f9cba2644a30a118653ea5fa
Author: Rui Li <ru...@intel.com>
Date: 2014-07-16T17:23:37Z
SPARK-2277: make TaskScheduler track hosts on rack
Hi mateiz, I've created [SPARK-2277](https://issues.apache.org/jira/browse/SPARK-2277) to make TaskScheduler track hosts on each rack. Please help to review, thanks.
Author: Rui Li <ru...@intel.com>
Closes #1212 from lirui-intel/trackHostOnRack and squashes the following commits:
2b4bd0f [Rui Li] SPARK-2277: refine UT
fbde838 [Rui Li] SPARK-2277: add UT
7bbe658 [Rui Li] SPARK-2277: rename the method
5e4ef62 [Rui Li] SPARK-2277: remove unnecessary import
79ac750 [Rui Li] SPARK-2277: make TaskScheduler track hosts on rack
commit efe2a8b1262a371471f52ca7d47dc34789e80558
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-16T17:44:54Z
Tightening visibility for various Broadcast related classes.
In preparation for SPARK-2521.
Author: Reynold Xin <rx...@apache.org>
Closes #1438 from rxin/broadcast and squashes the following commits:
432f1cc [Reynold Xin] Tightening visibility for various Broadcast related classes.
commit df95d82da7c76c074fd4064f7c870d55d99e0d8e
Author: Yin Huai <hu...@cse.ohio-state.edu>
Date: 2014-07-16T17:53:59Z
[SPARK-2525][SQL] Remove as many compilation warning messages as possible in Spark SQL
JIRA: https://issues.apache.org/jira/browse/SPARK-2525.
Author: Yin Huai <hu...@cse.ohio-state.edu>
Closes #1444 from yhuai/SPARK-2517 and squashes the following commits:
edbac3f [Yin Huai] Removed some compiler type erasure warnings.
commit 1c5739f68510c2336bf6cb3e18aea03d85988bfb
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-16T17:55:47Z
[SQL] Cleaned up ConstantFolding slightly.
Moved a couple of rules out of NullPropagation and added more comments.
Author: Reynold Xin <rx...@apache.org>
Closes #1430 from rxin/sql-folding-rule and squashes the following commits:
7f9a197 [Reynold Xin] Updated documentation for ConstantFolding.
7f8cf61 [Reynold Xin] [SQL] Cleaned up ConstantFolding slightly.
commit fc7edc9e76f97b25e456ae7b72ef8636656f4f1a
Author: Sandy Ryza <sa...@cloudera.com>
Date: 2014-07-16T18:07:16Z
SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical...
... aggregation code
Author: Sandy Ryza <sa...@cloudera.com>
Closes #1435 from sryza/sandy-spark-2519 and squashes the following commits:
640706a [Sandy Ryza] SPARK-2519. Eliminate pattern-matching on Tuple2 in performance-critical aggregation code
commit cc965eea510397642830acb21f61127b68c098d6
Author: Takuya UESHIN <ue...@happy-camper.st>
Date: 2014-07-16T18:13:38Z
[SPARK-2518][SQL] Fix foldability of Substring expression.
This is a follow-up of #1428.
Author: Takuya UESHIN <ue...@happy-camper.st>
Closes #1432 from ueshin/issues/SPARK-2518 and squashes the following commits:
37d1ace [Takuya UESHIN] Fix foldability of Substring expression.
commit ef48222c10be3d29a83dfc2329f455eba203cd38
Author: Reynold Xin <rx...@apache.org>
Date: 2014-07-16T18:15:07Z
[SPARK-2517] Remove some compiler warnings.
Author: Reynold Xin <rx...@apache.org>
Closes #1433 from rxin/compile-warning and squashes the following commits:
8d0b890 [Reynold Xin] Remove some compiler warnings.
----
---
[GitHub] spark pull request: [SPARK-1470] Use the scala-logging wrapper ins...
Posted by witgo <gi...@git.apache.org>.
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/332
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11331064
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -30,65 +30,66 @@ trait Logging {
// Make the log field transient so that objects with Logging can
// be serialized and used on another machine
@transient private var log_ : Logger = null
+ @transient protected var logName_ : String = null
// Method to get or create the logger for this object
protected def log: Logger = {
if (log_ == null) {
initializeIfNecessary()
- var className = this.getClass.getName
+ var className = if(logName_ == null) this.getClass.getName else logName_
// Ignore trailing $'s in the class names for Scala objects
if (className.endsWith("$")) {
className = className.substring(0, className.length - 1)
}
- log_ = LoggerFactory.getLogger(className)
+ log_ = Logger(LoggerFactory.getLogger(className))
}
log_
}
// Log methods that take only a String
protected def logInfo(msg: => String) {
- if (log.isInfoEnabled) log.info(msg)
+ log.info(msg)
}
protected def logDebug(msg: => String) {
- if (log.isDebugEnabled) log.debug(msg)
+ log.debug(msg)
--- End diff --
This could cause a significant performance hit. Previously, we would not evaluate the "msg" function unless the specified level was enabled, which allows us to avoid string building and calling various toString methods when the log messages won't be seen. Can we re-add these checks using `log.underlying`?
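The lazy-evaluation point being made here can be shown with a minimal toy sketch (not Spark's actual Logging trait, and the names are illustrative): with a by-name parameter plus an enabled-check, the message expression is never evaluated when the level is off.

```scala
// Toy illustration of why the removed guards mattered: `msg: => String` is a
// by-name parameter, so the message expression is only forced inside the if,
// and expensive string building is skipped when the level is disabled.
// Not Spark's actual code; names here are illustrative.
object LazyLogDemo {
  var debugEnabled = false
  var evaluations = 0

  def logDebug(msg: => String): Unit =
    if (debugEnabled) println(msg) // msg is only evaluated here

  def expensive(): String = { evaluations += 1; "expensive message" }

  def main(args: Array[String]): Unit = {
    logDebug(expensive())   // debug off: expensive() is never called
    assert(evaluations == 0)
    debugEnabled = true
    logDebug(expensive())   // debug on: expensive() runs exactly once
    assert(evaluations == 1)
  }
}
```

Going through `log.underlying` (the wrapped slf4j logger) would presumably expose the `isDebugEnabled`-style checks needed to restore this guard on top of the scala-logging wrapper.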
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39700055
Is performance really a problem here? I don't think we log stuff on the critical path (we probably shouldn't). If we do, maybe we can just get rid of that logging.
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39655093
Scala logging is not a new logger. It is just a set of macro functions that call slf4j, wrapping the logging calls in checks to see if the corresponding levels are enabled. This lets you log without the performance overhead of doing string concatenation when the level is disabled, without allocating a closure (as Spark's current logging does), and without having to write out the if checks manually yourself.
---
[GitHub] spark pull request: [SPARK-1470][SPARK-1842] Use the scala-logging...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-50961153
Jenkins, test this please.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40863767
@marmbrus
You are right, macros are faster than the inline method, though the gap is small.
The [test code](https://github.com/witgo/spark/blob/logger/core/src/test/scala/org/apache/spark/LoggingSuite.scala)
gives these results:
```
1
logString: 38987.745 ms
logInline: 14.198 ms
logFunction0: 16.22 ms
logMacros: 12.312 ms
2
logString: 39037.412 ms
logInline: 14.349 ms
logFunction0: 16.2 ms
logMacros: 12.249 ms
3
logString: 39169.036 ms
logInline: 14.324 ms
logFunction0: 16.871 ms
logMacros: 11.324 ms
```
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40626222
I don't think macros are faster than the inline method:
```scala
import scala.compat.Platform

class Log(val flag: Boolean) {
  def logFunction0(msg: => String) {
    if (flag)
      println(msg)
  }
  def logString(msg: String) {
    if (flag)
      println(msg)
  }
  @inline
  def logInline(msg: => String) {
    if (flag)
      println(msg)
  }
}

val log = new Log(false)
var startTime = Platform.currentTime
var i = 0
while (i < 1000000) {
  log.logInline("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
var stopTime = Platform.currentTime
Platform.collectGarbage
println("logInline: " + (stopTime - startTime) + " ms")

startTime = Platform.currentTime
i = 0
while (i < 1000000) {
  log.logFunction0("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
stopTime = Platform.currentTime
Platform.collectGarbage
println("logFunction0: " + (stopTime - startTime) + " ms")

startTime = Platform.currentTime
i = 0
while (i < 1000000) {
  log.logString("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
stopTime = Platform.currentTime
Platform.collectGarbage
println("logString: " + (stopTime - startTime) + " ms")
```
=>
1.
logInline: 9 ms
logFunction0: 14 ms
logString: 29114 ms
2.
logInline: 10 ms
logFunction0: 13 ms
logString: 28876 ms
3.
logInline: 9 ms
logFunction0: 13 ms
logString: 28521 ms
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40646342
It doesn't appear that your code is actually using macros, so I'm not sure what the comparison is here. I'd also caution against reading too much into simple micro benchmarks. It's entirely possible that the JVM is statically eliminating the if statement or making other simplifications when it JITs this code, so you might just be measuring the time to add an integer. Also, for benchmarks you should [use nanoTime not currentTime](http://stackoverflow.com/questions/351565/system-currenttimemillis-vs-system-nanotime).
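The nanoTime suggestion can be sketched as a small timing helper. The names below are ours, not from the PR, and this is a rough illustration only (a real benchmark would use a harness such as JMH):

```scala
// Sketch: time a block with System.nanoTime, with a warm-up pass so the
// JIT has a chance to compile the body before measurement begins.
// This is illustrative, not a rigorous benchmark harness.
object NanoBench {
  def timeMillis[A](iterations: Int)(body: => A): Double = {
    // Warm-up pass (result discarded).
    var i = 0
    while (i < iterations) { body; i += 1 }
    // Measured pass.
    val start = System.nanoTime()
    i = 0
    while (i < iterations) { body; i += 1 }
    val elapsedNanos = System.nanoTime() - start
    elapsedNanos / 1e6
  }
}
```

Unlike `currentTimeMillis`, `nanoTime` is a monotonic clock intended for measuring elapsed intervals, which is why it is preferred for benchmarks.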
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39652335
/cc @marmbrus - do you have strong attachments to scala logging? For the short term I do agree consolidating on Spark's existing logging would be nice. In particular, downstream consumers of Spark's logging subsystem do crazy stuff like routing to different slf4j backends or inserting other log bridges and interceptors (e.g. http://apache-spark-developers-list.1001551.n3.nabble.com/0-9-0-forces-log4j-usage-td532.html)
Adding another logger into the mix seems like it could be risky. Or at least, it might be nice to merge this now and then separately decide if we want Spark's overall logging system to migrate to scalalogging.
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40295844
We should wait until questions like these are answered before we move to scala-logging.
https://github.com/typesafehub/scala-logging/issues/4
https://github.com/typesafehub/scalalogging/issues/42
At the very least, when we decide to move, we should make sure we can switch off it easily in case we found (blocker level) problems with it.
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39677846
@witgo could you take a crack at doing that? I think it would be a fairly small change to `Logging.scala` where we just use the scala-logging wrapper instead of calling the slf4j API directly
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39701040
I did not find a specific call that affects performance.
It is possible that it is here: Spark Catalyst's `logger.debug` is called many times,
for example at
[RuleExecutor.scala#L64](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala#L64)
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11330919
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -30,65 +30,66 @@ trait Logging {
// Make the log field transient so that objects with Logging can
// be serialized and used on another machine
@transient private var log_ : Logger = null
+ @transient protected var logName_ : String = null
--- End diff --
It might be cleaner to do this:
```
protected def logName = {
var className = this.getClass.getName
// Ignore trailing $'s in the class names for Scala objects
if (className.endsWith("$")) {
className = className.substring(0, className.length - 1)
}
className
}
```
Then let subclasses override that function. Then below you could just have:
```
log_ = Logger(LoggerFactory.getLogger(logName))
```
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11331098
--- Diff: project/SparkBuild.scala ---
@@ -315,36 +315,37 @@ object SparkBuild extends Build {
def coreSettings = sharedSettings ++ Seq(
name := "spark-core",
libraryDependencies ++= Seq(
- "com.google.guava" % "guava" % "14.0.1",
- "com.google.code.findbugs" % "jsr305" % "1.3.9",
- "log4j" % "log4j" % "1.2.17",
- "org.slf4j" % "slf4j-api" % slf4jVersion,
- "org.slf4j" % "slf4j-log4j12" % slf4jVersion,
- "org.slf4j" % "jul-to-slf4j" % slf4jVersion,
- "org.slf4j" % "jcl-over-slf4j" % slf4jVersion,
- "commons-daemon" % "commons-daemon" % "1.0.10", // workaround for bug HADOOP-9407
- "com.ning" % "compress-lzf" % "1.0.0",
- "org.xerial.snappy" % "snappy-java" % "1.0.5",
- "org.spark-project.akka" %% "akka-remote" % akkaVersion excludeAll(excludeNetty),
- "org.spark-project.akka" %% "akka-slf4j" % akkaVersion excludeAll(excludeNetty),
- "org.spark-project.akka" %% "akka-testkit" % akkaVersion % "test",
- "org.json4s" %% "json4s-jackson" % "3.2.6" excludeAll(excludeScalap),
- "it.unimi.dsi" % "fastutil" % "6.4.4",
- "colt" % "colt" % "1.2.0",
- "org.apache.mesos" % "mesos" % "0.13.0",
- "commons-net" % "commons-net" % "2.2",
- "net.java.dev.jets3t" % "jets3t" % "0.7.1" excludeAll(excludeCommonsLogging),
- "org.apache.derby" % "derby" % "10.4.2.0" % "test",
- "org.apache.hadoop" % hadoopClient % hadoopVersion excludeAll(excludeNetty, excludeAsm, excludeCommonsLogging, excludeSLF4J, excludeOldAsm),
- "org.apache.curator" % "curator-recipes" % "2.4.0" excludeAll(excludeNetty),
- "com.codahale.metrics" % "metrics-core" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-jvm" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-json" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-graphite" % codahaleMetricsVersion,
- "com.twitter" %% "chill" % chillVersion excludeAll(excludeAsm),
- "com.twitter" % "chill-java" % chillVersion excludeAll(excludeAsm),
- "org.tachyonproject" % "tachyon" % "0.4.1-thrift" excludeAll(excludeHadoop, excludeCurator, excludeEclipseJetty, excludePowermock),
- "com.clearspring.analytics" % "stream" % "2.5.1"
+ "com.google.guava" % "guava" % "14.0.1",
+ "com.google.code.findbugs" % "jsr305" % "1.3.9",
+ "log4j" % "log4j" % "1.2.17",
+ "org.slf4j" % "slf4j-api" % slf4jVersion,
+ "org.slf4j" % "slf4j-log4j12" % slf4jVersion,
+ "org.slf4j" % "jul-to-slf4j" % slf4jVersion,
+ "org.slf4j" % "jcl-over-slf4j" % slf4jVersion,
+ "com.typesafe" %% "scalalogging-slf4j" % "1.0.1",
--- End diff --
I don't think it's really worth the diff changes to re-align the %s for this longer one, but if you're intent on keeping it we should follow the style -- remove one space before the %%
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39694872
Merged build started.
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39645375
@witgo why do you want to remove this? I agree it would be good to consolidate the logging infrastructure (e.g. not use different logging for catalyst)... but is there a specific issue with scala logging?
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39695999
@marmbrus As it stands, this patch seems to give us both worlds (unnecessarily): we wrap the logging statements in a closure and then use the scala logging macro to determine if we should unwrap. However, if we were to simply make the logInfo() arguments into plain Strings, wouldn't we also lose the benefits of the Scala macro, since the string has already been materialized as the argument of logInfo? Would adding an `@inline` annotation avoid this or would we have to make our log* functions into macros themselves to avoid this overhead while keeping the name?
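The concern above can be checked directly: a plain `String` parameter forces the message to be built at the call site, while a by-name `=> String` parameter defers it until (and unless) it is used. A small sketch with illustrative names, not Spark's actual API:

```scala
// Sketch: count how many times an "expensive" message is actually built for
// an eager String parameter versus a by-name parameter when the level check
// fails. Names here are illustrative, not Spark's API.
object EvalDemo {
  var builds = 0
  def expensiveMessage(): String = { builds += 1; "msg" }

  // Plain String parameter: the argument is built before the call happens.
  def eagerLog(enabled: Boolean, msg: String): Unit =
    if (enabled) println(msg)

  // By-name parameter: the argument is only built if actually referenced.
  def byNameLog(enabled: Boolean, msg: => String): Unit =
    if (enabled) println(msg)

  // Returns (builds with eager parameter, builds with by-name parameter).
  def run(): (Int, Int) = {
    builds = 0
    eagerLog(false, expensiveMessage())   // message built even though disabled
    val eagerBuilds = builds
    builds = 0
    byNameLog(false, expensiveMessage())  // message never built
    (eagerBuilds, builds)
  }
}
```

This is why changing `logInfo`'s argument to a plain `String` would reintroduce the materialization cost: the string exists before the macro (or any guard) gets a chance to run.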
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39695939
Merged build finished.
---
[GitHub] spark pull request: [SPARK-1470][SPARK-1842] Use the scala-logging...
Posted by witgo <gi...@git.apache.org>.
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/332
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40625682
I don't think macros are faster than the inline method:
```scala
import scala.compat.Platform
import language.experimental.macros
import scala.reflect.macros.Context

class Log(val flag: Boolean) {
  def logFunction0(msg: => String) {
    if (flag)
      println(msg)
  }
  def logString(msg: String) {
    if (flag)
      println(msg)
  }
  @inline
  def logInline(msg: => String) {
    if (flag)
      println(msg)
  }
}

val log = new Log(false)
var startTime = Platform.currentTime
var i = 0
while (i < 1000000) {
  log.logInline("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
var stopTime = Platform.currentTime
Platform.collectGarbage
println("logInline: " + (stopTime - startTime) + " ms")

startTime = Platform.currentTime
i = 0
while (i < 1000000) {
  log.logFunction0("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
stopTime = Platform.currentTime
Platform.collectGarbage
println("logFunction0: " + (stopTime - startTime) + " ms")

startTime = Platform.currentTime
i = 0
while (i < 1000000) {
  log.logString("" + Thread.currentThread.getStackTrace().
    toSeq.take(10).mkString("\n"))
  i += 1
}
stopTime = Platform.currentTime
Platform.collectGarbage
println("logString: " + (stopTime - startTime) + " ms")
```
=>
1.
logInline: 9 ms
logFunction0: 14 ms
logString: 29114 ms
2.
logInline: 10 ms
logFunction0: 13 ms
logString: 28876 ms
3.
logInline: 9 ms
logFunction0: 13 ms
logString: 28521 ms
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11330926
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -30,65 +30,66 @@ trait Logging {
// Make the log field transient so that objects with Logging can
// be serialized and used on another machine
@transient private var log_ : Logger = null
+ @transient protected var logName_ : String = null
// Method to get or create the logger for this object
protected def log: Logger = {
if (log_ == null) {
initializeIfNecessary()
- var className = this.getClass.getName
+ var className = if(logName_ == null) this.getClass.getName else logName_
// Ignore trailing $'s in the class names for Scala objects
if (className.endsWith("$")) {
--- End diff --
This is a bit confusing now because you have this thing called `className` but it may-or-may-not refer to a class. See the note above.
---
[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r12509814
--- Diff: project/SparkBuild.scala ---
@@ -317,6 +317,7 @@ object SparkBuild extends Build {
val excludeFastutil = ExclusionRule(organization = "it.unimi.dsi")
val excludeJruby = ExclusionRule(organization = "org.jruby")
val excludeThrift = ExclusionRule(organization = "org.apache.thrift")
+ val excludeScalalogging= ExclusionRule(organization = "com.typesafe", artifact = "scalalogging-slf4j")
--- End diff --
nit: space before =. Also, does this exclusion rule work despite having only an approximate artifact id?
---
[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-42771828
@pwendell @marmbrus @rxin
typesafehub/scala-logging#4 has been solved by typesafehub/scala-logging#15
The PR should now be able to be merged into master
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r11331104
--- Diff: project/SparkBuild.scala ---
@@ -315,36 +315,37 @@ object SparkBuild extends Build {
def coreSettings = sharedSettings ++ Seq(
name := "spark-core",
libraryDependencies ++= Seq(
- "com.google.guava" % "guava" % "14.0.1",
- "com.google.code.findbugs" % "jsr305" % "1.3.9",
- "log4j" % "log4j" % "1.2.17",
- "org.slf4j" % "slf4j-api" % slf4jVersion,
- "org.slf4j" % "slf4j-log4j12" % slf4jVersion,
- "org.slf4j" % "jul-to-slf4j" % slf4jVersion,
- "org.slf4j" % "jcl-over-slf4j" % slf4jVersion,
- "commons-daemon" % "commons-daemon" % "1.0.10", // workaround for bug HADOOP-9407
- "com.ning" % "compress-lzf" % "1.0.0",
- "org.xerial.snappy" % "snappy-java" % "1.0.5",
- "org.spark-project.akka" %% "akka-remote" % akkaVersion excludeAll(excludeNetty),
- "org.spark-project.akka" %% "akka-slf4j" % akkaVersion excludeAll(excludeNetty),
- "org.spark-project.akka" %% "akka-testkit" % akkaVersion % "test",
- "org.json4s" %% "json4s-jackson" % "3.2.6" excludeAll(excludeScalap),
- "it.unimi.dsi" % "fastutil" % "6.4.4",
- "colt" % "colt" % "1.2.0",
- "org.apache.mesos" % "mesos" % "0.13.0",
- "commons-net" % "commons-net" % "2.2",
- "net.java.dev.jets3t" % "jets3t" % "0.7.1" excludeAll(excludeCommonsLogging),
- "org.apache.derby" % "derby" % "10.4.2.0" % "test",
- "org.apache.hadoop" % hadoopClient % hadoopVersion excludeAll(excludeNetty, excludeAsm, excludeCommonsLogging, excludeSLF4J, excludeOldAsm),
- "org.apache.curator" % "curator-recipes" % "2.4.0" excludeAll(excludeNetty),
- "com.codahale.metrics" % "metrics-core" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-jvm" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-json" % codahaleMetricsVersion,
- "com.codahale.metrics" % "metrics-graphite" % codahaleMetricsVersion,
- "com.twitter" %% "chill" % chillVersion excludeAll(excludeAsm),
- "com.twitter" % "chill-java" % chillVersion excludeAll(excludeAsm),
- "org.tachyonproject" % "tachyon" % "0.4.1-thrift" excludeAll(excludeHadoop, excludeCurator, excludeEclipseJetty, excludePowermock),
- "com.clearspring.analytics" % "stream" % "2.5.1"
+ "com.google.guava" % "guava" % "14.0.1",
+ "com.google.code.findbugs" % "jsr305" % "1.3.9",
+ "log4j" % "log4j" % "1.2.17",
+ "org.slf4j" % "slf4j-api" % slf4jVersion,
+ "org.slf4j" % "slf4j-log4j12" % slf4jVersion,
+ "org.slf4j" % "jul-to-slf4j" % slf4jVersion,
+ "org.slf4j" % "jcl-over-slf4j" % slf4jVersion,
+ "com.typesafe" %% "scalalogging-slf4j" % "1.0.1",
+ "commons-daemon" % "commons-daemon" % "1.0.10", // workaround for bug HADOOP-9407
+ "com.ning" % "compress-lzf" % "1.0.0",
+ "org.xerial.snappy" % "snappy-java" % "1.0.5",
+ "org.spark-project.akka" %% "akka-remote" % akkaVersion excludeAll(excludeNetty),
+ "org.spark-project.akka" %% "akka-slf4j" % akkaVersion excludeAll(excludeNetty),
+ "org.spark-project.akka" %% "akka-testkit" % akkaVersion % "test",
+ "org.json4s" %% "json4s-jackson" % "3.2.6" excludeAll(excludeScalap),
+ "it.unimi.dsi" % "fastutil" % "6.4.4",
+ "colt" % "colt" % "1.2.0",
+ "org.apache.mesos" % "mesos" % "0.13.0",
+ "commons-net" % "commons-net" % "2.2",
+ "net.java.dev.jets3t" % "jets3t" % "0.7.1" excludeAll(excludeCommonsLogging),
+ "org.apache.derby" % "derby" % "10.4.2.0" % "test",
+ "org.apache.hadoop" % hadoopClient % hadoopVersion excludeAll(excludeNetty, excludeAsm, excludeCommonsLogging, excludeSLF4J, excludeOldAsm),
+ "org.apache.curator" % "curator-recipes" % "2.4.0" excludeAll(excludeNetty),
+ "com.codahale.metrics" % "metrics-core" % codahaleMetricsVersion,
+ "com.codahale.metrics" % "metrics-jvm" % codahaleMetricsVersion,
+ "com.codahale.metrics" % "metrics-json" % codahaleMetricsVersion,
+ "com.codahale.metrics" % "metrics-graphite" % codahaleMetricsVersion,
+ "com.twitter" %% "chill" % chillVersion excludeAll(excludeAsm),
+ "com.twitter" % "chill-java" % chillVersion excludeAll(excludeAsm),
+ "org.tachyonproject" % "tachyon" % "0.4.1-thrift" excludeAll(excludeHadoop, excludeCurator, excludeEclipseJetty, excludePowermock),
+ "com.clearspring.analytics" % "stream" % "2.5.1"
--- End diff --
nit: re-add space before the last % to align it again
---
[GitHub] spark pull request: [SPARK-1470] Use the scala-logging wrapper ins...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-48708675
It can't be tested automatically. I have submitted a new PR, #1369.
---
[GitHub] spark pull request: [SPARK-1470] Use the scala-logging wrapper ins...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-46771412
Jenkins, test this please.
---
[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...
Posted by aarondav <gi...@git.apache.org>.
Github user aarondav commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r12509797
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -116,7 +121,8 @@ trait Logging {
val log4jInitialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
if (!log4jInitialized && usingLog4j) {
val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
- Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
+ val classLoader = this.getClass.getClassLoader
--- End diff --
Why was this change made? It's almost the same thing as Utils.getSparkClassLoader, just not using that abstraction?
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39651255
(For what my $0.02 may be worth it seems worth it just for the consolidation.)
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39696780
One solution is to have these older messages for compatibility (with deprecation) but then also expose the underlying log directly to users and convert spark's code to use the new version. The `Logging` trait in Spark is public so it's likely downstream applications are using it.
It's too bad though - the whole reason `Logging` exists is to provide an indirection layer so we don't have to depend on specific implementations but that's exactly what will cause performance overhead in this case.
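That compatibility approach can be sketched roughly as follows. `MiniLogger` below is a hypothetical stand-in for the real slf4j/scala-logging type, and the trait and method names are illustrative, not the actual `Logging` trait:

```scala
// Sketch of the compatibility idea: keep the closure-taking methods for
// existing downstream users of the public Logging trait (marked deprecated),
// while exposing the underlying logger for new code to call directly.
// `MiniLogger` is a hypothetical stand-in, not the real slf4j type.
class MiniLogger {
  val messages = scala.collection.mutable.ArrayBuffer.empty[String]
  def info(msg: String): Unit = messages += msg
}

trait CompatLogging {
  // New code would use this directly (e.g. via scala-logging's macros).
  lazy val log: MiniLogger = new MiniLogger

  // Old API retained so existing callers keep compiling.
  @deprecated("use log directly", "sketch")
  def logInfo(msg: => String): Unit = log.info(msg)
}
```

The trade-off pwendell describes remains visible here: the indirection layer keeps callers decoupled from the backend, but each extra layer is exactly where closure-allocation overhead creeps in.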
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39655050
We should use a unified logging interface, and migrating the overall logging system to scalalogging is a good idea.
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39656388
Ah I see. If that's the case, how does this differ from the checks we already do inside of Spark's normal logging?
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/Logging.scala#L54
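For context, the difference the thread is circling around can be shown side by side. Spark's trait takes the message as a by-name parameter, so a closure is allocated at every call site even when the level is disabled; scala-logging's macros instead rewrite the call into an inline guard at compile time, so nothing is allocated. A self-contained sketch with illustrative names (no slf4j dependency):

```scala
object GuardStyles {
  var infoEnabled = false   // simulates log.isInfoEnabled returning false
  var messagesBuilt = 0

  // style 1: by-name parameter, like Spark's Logging trait
  def logInfo(msg: => String): Unit =
    if (infoEnabled) println(msg)

  // a deliberately expensive message, so skipping it matters
  def expensive(): String = { messagesBuilt += 1; (1 to 100).mkString(",") }

  def main(args: Array[String]): Unit = {
    // by-name call: a closure is allocated, but the body never runs
    logInfo(expensive())
    // what a scala-logging macro call expands to, written out by hand:
    // the guard is inlined, so neither a closure nor the string is built
    if (infoEnabled) println(expensive())
    assert(messagesBuilt == 0)
  }
}
```

In both styles the expensive string is never built when the level is off; the macro version additionally avoids the per-call closure allocation.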
---
[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r12512463
--- Diff: project/SparkBuild.scala ---
@@ -317,6 +317,7 @@ object SparkBuild extends Build {
val excludeFastutil = ExclusionRule(organization = "it.unimi.dsi")
val excludeJruby = ExclusionRule(organization = "org.jruby")
val excludeThrift = ExclusionRule(organization = "org.apache.thrift")
+ val excludeScalalogging= ExclusionRule(organization = "com.typesafe", artifact = "scalalogging-slf4j")
--- End diff --
I have not tested this case, and I think this code is not very readable.
---
[GitHub] spark pull request: SPARK-1470: Spark logger moving to use scala-l...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-40674103
``` scala
// inline: the bare loop plus a guarded log call
var i = 0
while (i < 1000000) {
  if (log.isInfoEnabled)  // the only extra work compared to the micro loop
    println(Thread.currentThread.getStackTrace()
      .toSeq.take(10).mkString("\n"))
  i += 1
}

// micro: the bare loop
i = 0
while (i < 1000000) {
  i += 1
}
```
A simple if check like this will not trigger garbage collection, so I do not think you will be able to measure the gap.
I will write more complex test code this weekend.
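A slightly more complete harness along those lines could look like this sketch: warm the JIT up first, then time both loops with `System.nanoTime`. The `infoEnabled` flag is a stand-in for `log.isInfoEnabled`, so no logging dependency is needed; all names here are illustrative.

```scala
object GuardBench {
  val infoEnabled = false   // stand-in for log.isInfoEnabled

  // time a block of code once, in nanoseconds
  def timed(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    System.nanoTime() - start
  }

  // the guarded loop from the snippet above
  def guarded(): Unit = {
    var i = 0
    while (i < 1000000) {
      if (infoEnabled)
        println(Thread.currentThread.getStackTrace()
          .toSeq.take(10).mkString("\n"))
      i += 1
    }
  }

  // the bare loop
  def micro(): Unit = {
    var i = 0
    while (i < 1000000) i += 1
  }

  def main(args: Array[String]): Unit = {
    // warm-up so the JIT compiles both loops before measuring
    (1 to 20).foreach { _ => guarded(); micro() }
    println(s"guarded: ${timed(guarded())} ns, micro: ${timed(micro())} ns")
  }
}
```

Even with warm-up, a single-shot measurement like this is noisy, and the JIT may optimize the bare loop away entirely; a real comparison would need a proper micro-benchmark harness.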
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39677214
The thing is - adding a dependency is not free. It means that everyone who packages Spark downstream needs to deal with a bunch of annoying stuff. For instance, this introduces a new way for us to get an unintentional version bump of slf4j in the Spark build. It also means that if users are upgrading slf4j on their own, they might have an issue where scala-logging doesn't work against newer versions of slf4j. It means that OS packagers will have to create a new RPM/Debian package for scala-logging if one doesn't exist. Basically, my point is we should have a high bar for any dependency.
If we go with the default plan of "add it if it provides nonzero benefit" - that's way too low of a bar and would result in an unwieldy project.
Anyways on this one, maybe the best answer is to just re-write the spark logging to use scala-logging. Seems like it would be very straightforward. From my end, the main concern is just consolidating it with the normal spark logging.
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39696209
Hm, good call @aarondav. I'm not sure if that would work. If it doesn't, would there be any other way to do this than to just use the scala logger directly?
---
[GitHub] spark pull request: [SPARK-1470][SPARK-1842] Use the scala-logging...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-50961189
QA tests have started for PR 332. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17767/consoleFull
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39639841
Can one of the admins verify this patch?
---
[GitHub] spark pull request: Spark logger moving to use scala-logging
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39694868
Merged build triggered.
---
[GitHub] spark pull request: remove scalalogging-slf4j dependency
Posted by markhamstra <gi...@git.apache.org>.
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/332#issuecomment-39676694
Very similar to what I was thinking @marmbrus -- but you wrote it up more completely than I would have. This is also why I've been moving some performance-critical proprietary code over to using Scala logging. It isn't huge, but it does make a difference, and it's generally no more difficult to use Scala logging than older methods, so I'd also like to see Spark moving to use Typesafe's library.
---
[GitHub] spark pull request: [SPARK-1470] Spark logger moving to use scala-...
Posted by witgo <gi...@git.apache.org>.
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/332#discussion_r12512415
--- Diff: core/src/main/scala/org/apache/spark/Logging.scala ---
@@ -116,7 +121,8 @@ trait Logging {
val log4jInitialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
if (!log4jInitialized && usingLog4j) {
val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
- Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
+ val classLoader = this.getClass.getClassLoader
--- End diff --
Uh, I'm sorry, I think this is a problem introduced when merging the code.
---