Posted to reviews@spark.apache.org by zhaorongsheng <gi...@git.apache.org> on 2016/05/14 14:40:38 UTC

[GitHub] spark pull request: update from orign

GitHub user zhaorongsheng opened a pull request:

    https://github.com/apache/spark/pull/13118

    update from orign

    ## What changes were proposed in this pull request?
    
    (Please fill in changes proposed in this fix)
    
    
    ## How was this patch tested?
    
    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    
    
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zhaorongsheng/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13118
    
----
commit 00a39d9c05c55b5ffcd4f49aadc91cedf227669a
Author: Patrick Wendell <pw...@gmail.com>
Date:   2015-12-15T23:09:57Z

    Preparing Spark release v1.6.0-rc3

commit 08aa3b47e6a295a8297e741effa14cd0d834aea8
Author: Patrick Wendell <pw...@gmail.com>
Date:   2015-12-15T23:10:04Z

    Preparing development version 1.6.0-SNAPSHOT

commit 9e4ac56452710ddd8efb695e69c8de49317e3f28
Author: tedyu <yu...@gmail.com>
Date:   2015-12-16T02:15:10Z

    [SPARK-12056][CORE] Part 2 Create a TaskAttemptContext only after calling setConf
    
    This is continuation of SPARK-12056 where change is applied to SqlNewHadoopRDD.scala
    
    andrewor14
    FYI
    
    Author: tedyu <yu...@gmail.com>
    
    Closes #10164 from tedyu/master.
    
    (cherry picked from commit f725b2ec1ab0d89e35b5e2d3ddeddb79fec85f6d)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit 2c324d35a698b353c2193e2f9bd8ba08c741c548
Author: Timothy Chen <tn...@gmail.com>
Date:   2015-12-16T02:20:00Z

    [SPARK-12351][MESOS] Add documentation about submitting Spark with mesos cluster mode.
    
    Adding more documentation about submitting jobs with mesos cluster mode.
    
    Author: Timothy Chen <tn...@gmail.com>
    
    Closes #10086 from tnachen/mesos_supervise_docs.
    
    (cherry picked from commit c2de99a7c3a52b0da96517c7056d2733ef45495f)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit 8e9a600313f3047139d3cebef85acc782903123b
Author: Naveen <na...@gmail.com>
Date:   2015-12-16T02:25:22Z

    [SPARK-9886][CORE] Fix to use ShutdownHookManager in ExternalBlockStore.scala
    
    Author: Naveen <na...@gmail.com>
    
    Closes #10313 from naveenminchu/branch-fix-SPARK-9886.
    
    (cherry picked from commit 8a215d2338c6286253e20122640592f9d69896c8)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit 93095eb29a1e59dbdbf6220bfa732b502330e6ae
Author: Bryan Cutler <bj...@us.ibm.com>
Date:   2015-12-16T02:28:16Z

    [SPARK-12062][CORE] Change Master to async rebuild UI when application completes
    
    This change builds the event history of completed apps asynchronously so the RPC thread will not be blocked and allow new workers to register/remove if the event log history is very large and takes a long time to rebuild.
    
    Author: Bryan Cutler <bj...@us.ibm.com>
    
    Closes #10284 from BryanCutler/async-MasterUI-SPARK-12062.
    
    (cherry picked from commit c5b6b398d5e368626e589feede80355fb74c2bd8)
    Signed-off-by: Andrew Or <an...@databricks.com>
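
    The idea in the commit above is to hand the expensive event-log replay to a background thread so the RPC thread returns immediately. A minimal Python sketch of that pattern (illustrative names, not the actual Master code):

    ```python
    import threading

    def rebuild_ui(app_id, done_event):
        # Stand-in for the expensive event-log replay.
        done_event.set()

    def on_application_complete(app_id):
        """Return immediately; the rebuild runs asynchronously on a worker thread."""
        done = threading.Event()
        t = threading.Thread(target=rebuild_ui, args=(app_id, done), daemon=True)
        t.start()
        return done

    done = on_application_complete("app-123")
    done.wait(timeout=5)
    assert done.is_set()  # rebuild finished without blocking the caller
    ```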

commit fb08f7b784bc8b5e0cd110f315f72c7d9fc65e08
Author: Wenchen Fan <cl...@outlook.com>
Date:   2015-12-16T02:29:19Z

    [SPARK-10477][SQL] using DSL in ColumnPruningSuite to improve readability
    
    Author: Wenchen Fan <cl...@outlook.com>
    
    Closes #8645 from cloud-fan/test.
    
    (cherry picked from commit a89e8b6122ee5a1517fbcf405b1686619db56696)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit a2d584ed9ab3c073df057bed5314bdf877a47616
Author: Timothy Hunter <ti...@databricks.com>
Date:   2015-12-16T18:12:33Z

    [SPARK-12324][MLLIB][DOC] Fixes the sidebar in the ML documentation
    
    This fixes the sidebar, using a pure CSS mechanism to hide it when the browser's viewport is too narrow.
    Credit goes to the original author Titan-C (mentioned in the NOTICE).
    
    Note that I am not a CSS expert, so I can only address comments up to some extent.
    
    Default view:
    <img width="936" alt="screen shot 2015-12-14 at 12 46 39 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793597/6d1d6eda-a261-11e5-836b-6eb2054e9054.png">
    
    When collapsed manually by the user:
    <img width="1004" alt="screen shot 2015-12-14 at 12 54 02 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793669/c991989e-a261-11e5-8bf6-aecf3bdb6319.png">
    
    Disappears when column is too narrow:
    <img width="697" alt="screen shot 2015-12-14 at 12 47 22 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793607/7754dbcc-a261-11e5-8b15-e0d074b0e47c.png">
    
    Can still be opened by the user if necessary:
    <img width="651" alt="screen shot 2015-12-14 at 12 51 15 pm" src="https://cloud.githubusercontent.com/assets/7594753/11793612/7bf82968-a261-11e5-9cc3-e827a7a6b2b0.png">
    
    Author: Timothy Hunter <ti...@databricks.com>
    
    Closes #10297 from thunterdb/12324.
    
    (cherry picked from commit a6325fc401f68d9fa30cc947c44acc9d64ebda7b)
    Signed-off-by: Joseph K. Bradley <jo...@databricks.com>

commit ac0e2ea7c712e91503b02ae3c12fa2fcf5079886
Author: Yanbo Liang <yb...@gmail.com>
Date:   2015-12-16T18:34:30Z

    [SPARK-12310][SPARKR] Add write.json and write.parquet for SparkR
    
    Add ```write.json``` and ```write.parquet``` for SparkR, and deprecated ```saveAsParquetFile```.
    
    Author: Yanbo Liang <yb...@gmail.com>
    
    Closes #10281 from yanboliang/spark-12310.
    
    (cherry picked from commit 22f6cd86fc2e2d6f6ad2c3aae416732c46ebf1b1)
    Signed-off-by: Shivaram Venkataraman <sh...@cs.berkeley.edu>

commit 16edd933d7323f8b6861409bbd62bc1efe244c14
Author: Yu ISHIKAWA <yu...@gmail.com>
Date:   2015-12-16T18:43:45Z

    [SPARK-12215][ML][DOC] User guide section for KMeans in spark.ml
    
    cc jkbradley
    
    Author: Yu ISHIKAWA <yu...@gmail.com>
    
    Closes #10244 from yu-iskw/SPARK-12215.
    
    (cherry picked from commit 26d70bd2b42617ff731b6e9e6d77933b38597ebe)
    Signed-off-by: Joseph K. Bradley <jo...@databricks.com>

commit f815127294c06320204d9affa4f35da7ec3a710d
Author: Jeff Zhang <zj...@apache.org>
Date:   2015-12-16T18:32:32Z

    [SPARK-12318][SPARKR] Save mode in SparkR should be error by default
    
    shivaram  Please help review.
    
    Author: Jeff Zhang <zj...@apache.org>
    
    Closes #10290 from zjffdu/SPARK-12318.
    
    (cherry picked from commit 2eb5af5f0d3c424dc617bb1a18dd0210ea9ba0bc)
    Signed-off-by: Shivaram Venkataraman <sh...@cs.berkeley.edu>

commit e5b85713d8a0dbbb1a0a07481f5afa6c5098147f
Author: Timothy Chen <tn...@gmail.com>
Date:   2015-12-16T18:54:15Z

    [SPARK-12345][MESOS] Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode.
    
    SPARK_HOME is now causing problems with Mesos cluster mode, since the spark-submit script has recently been changed to take precedence when running spark-class scripts to look in SPARK_HOME if it's defined.
    
    We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.
    
    Author: Timothy Chen <tn...@gmail.com>
    
    Closes #10332 from tnachen/scheduler_ui.
    
    (cherry picked from commit ad8c1f0b840284d05da737fb2cc5ebf8848f4490)
    Signed-off-by: Andrew Or <an...@databricks.com>
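
    Skipping SPARK_HOME when forwarding the client environment amounts to a simple filter; a hedged Python sketch of the idea (not the actual Mesos scheduler code):

    ```python
    def filtered_env(env):
        """Drop SPARK_HOME so the client's local path is not forwarded;
        executors should rely on spark.executor.home instead."""
        return {k: v for k, v in env.items() if k != "SPARK_HOME"}

    client_env = {"SPARK_HOME": "/opt/spark", "PATH": "/usr/bin"}
    print(filtered_env(client_env))  # {'PATH': '/usr/bin'}
    ```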

commit e1adf6d7d1c755fb16a0030e66ce9cff348c3de8
Author: Yu ISHIKAWA <yu...@gmail.com>
Date:   2015-12-16T18:55:42Z

    [SPARK-6518][MLLIB][EXAMPLE][DOC] Add example code and user guide for bisecting k-means
    
    This PR includes only an example code in order to finish it quickly.
    I'll send another PR for the docs soon.
    
    Author: Yu ISHIKAWA <yu...@gmail.com>
    
    Closes #9952 from yu-iskw/SPARK-6518.
    
    (cherry picked from commit 7b6dc29d0ebbfb3bb941130f8542120b6bc3e234)
    Signed-off-by: Joseph K. Bradley <jo...@databricks.com>

commit 168c89e07c51fa24b0bb88582c739cec0acb44d7
Author: Patrick Wendell <pw...@gmail.com>
Date:   2015-12-16T19:23:41Z

    Preparing Spark release v1.6.0-rc3

commit aee88eb55b89bfdc763fd30f7574d2aa7de4bf39
Author: Patrick Wendell <pw...@gmail.com>
Date:   2015-12-16T19:23:52Z

    Preparing development version 1.6.0-SNAPSHOT

commit dffa6100d7d96eb38bf8a56f546d66f7a884b03f
Author: Joseph K. Bradley <jo...@databricks.com>
Date:   2015-12-16T19:53:04Z

    [SPARK-11608][MLLIB][DOC] Added migration guide for MLlib 1.6
    
    No known breaking changes, but some deprecations and changes of behavior.
    
    CC: mengxr
    
    Author: Joseph K. Bradley <jo...@databricks.com>
    
    Closes #10235 from jkbradley/mllib-guide-update-1.6.
    
    (cherry picked from commit 8148cc7a5c9f52c82c2eb7652d9aeba85e72d406)
    Signed-off-by: Joseph K. Bradley <jo...@databricks.com>

commit 04e868b63bfda5afe5cb1a0d6387fb873ad393ba
Author: Yanbo Liang <yb...@gmail.com>
Date:   2015-12-16T20:59:22Z

    [SPARK-12364][ML][SPARKR] Add ML example for SparkR
    
    We have a DataFrame example for SparkR; we also need to add an ML example under ```examples/src/main/r```.
    
    cc mengxr jkbradley shivaram
    
    Author: Yanbo Liang <yb...@gmail.com>
    
    Closes #10324 from yanboliang/spark-12364.
    
    (cherry picked from commit 1a8b2a17db7ab7a213d553079b83274aeebba86f)
    Signed-off-by: Joseph K. Bradley <jo...@databricks.com>

commit 552b38f87fc0f6fab61b1e5405be58908b7f5544
Author: Davies Liu <da...@databricks.com>
Date:   2015-12-16T23:48:11Z

    [SPARK-12380] [PYSPARK] use SQLContext.getOrCreate in mllib
    
    MLlib should use SQLContext.getOrCreate() instead of creating new SQLContext.
    
    Author: Davies Liu <da...@databricks.com>
    
    Closes #10338 from davies/create_context.
    
    (cherry picked from commit 27b98e99d21a0cc34955337f82a71a18f9220ab2)
    Signed-off-by: Davies Liu <da...@gmail.com>
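
    The get-or-create pattern referenced above can be sketched in plain Python (illustrative stand-in, not the real PySpark `SQLContext` API):

    ```python
    import threading

    class ContextLike:
        """Hypothetical stand-in for a heavyweight context object."""
        _instance = None
        _lock = threading.Lock()

        @classmethod
        def get_or_create(cls):
            # Reuse the existing context if one was already created,
            # instead of constructing a fresh one on every call.
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
                return cls._instance

    a = ContextLike.get_or_create()
    b = ContextLike.get_or_create()
    assert a is b  # both callers share the same context
    ```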

commit 638b89bc3b1c421fe11cbaf52649225662d3d3ce
Author: Andrew Or <an...@databricks.com>
Date:   2015-12-17T00:13:48Z

    [MINOR] Add missing interpolation in NettyRPCEnv
    
    ```
    Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException:
    Cannot receive any reply in ${timeout.duration}. This timeout is controlled by spark.rpc.askTimeout
    	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
    	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
    	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
    ```
    
    Author: Andrew Or <an...@databricks.com>
    
    Closes #10334 from andrewor14/rpc-typo.
    
    (cherry picked from commit 861549acdbc11920cde51fc57752a8bc241064e5)
    Signed-off-by: Shixiong Zhu <sh...@databricks.com>
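
    The bug above is a Scala string missing its `s` interpolator, which is why the literal `${timeout.duration}` leaked into the error message. The same class of bug in Python is forgetting the `f` prefix (illustrative sketch, not Spark code):

    ```python
    class Timeout:
        def __init__(self, duration):
            self.duration = duration

    timeout = Timeout("120 seconds")

    # Buggy: a plain string leaves the placeholder uninterpolated.
    buggy = "Cannot receive any reply in {timeout.duration}."
    # Fixed: the f prefix makes Python substitute the value.
    fixed = f"Cannot receive any reply in {timeout.duration}."

    print(buggy)  # Cannot receive any reply in {timeout.duration}.
    print(fixed)  # Cannot receive any reply in 120 seconds.
    ```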

commit fb02e4e3bcc50a8f823dfecdb2eef71287225e7b
Author: Imran Rashid <ir...@cloudera.com>
Date:   2015-12-17T03:01:05Z

    [SPARK-10248][CORE] track exceptions in dagscheduler event loop in tests
    
    `DAGSchedulerEventLoop` normally only logs errors (so it can continue to process more events, from other jobs).  However, this is not desirable in the tests -- the tests should be able to easily detect any exception, and also shouldn't silently succeed if there is an exception.
    
    This was suggested by mateiz on https://github.com/apache/spark/pull/7699.  It may have already turned up an issue in "zero split job".
    
    Author: Imran Rashid <ir...@cloudera.com>
    
    Closes #8466 from squito/SPARK-10248.
    
    (cherry picked from commit 38d9795a4fa07086d65ff705ce86648345618736)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit 4af64385b085002d94c54d11bbd144f9f026bbd8
Author: tedyu <yu...@gmail.com>
Date:   2015-12-17T03:02:12Z

    [SPARK-12365][CORE] Use ShutdownHookManager where Runtime.getRuntime.addShutdownHook() is called
    
    SPARK-9886 fixed ExternalBlockStore.scala
    
    This PR fixes the remaining references to Runtime.getRuntime.addShutdownHook()
    
    Author: tedyu <yu...@gmail.com>
    
    Closes #10325 from ted-yu/master.
    
    (cherry picked from commit f590178d7a06221a93286757c68b23919bee9f03)
    Signed-off-by: Andrew Or <an...@databricks.com>
    
    Conflicts:
    	sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
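
    The value of a shutdown-hook manager over raw `Runtime.getRuntime.addShutdownHook()` is that hooks run in a controlled priority order. A minimal Python sketch of the idea (hypothetical names, not Spark's actual implementation):

    ```python
    class ShutdownHookManager:
        """Run registered hooks highest-priority-first.
        In practice this would be wired up via atexit.register(manager.run_all)."""
        def __init__(self):
            self._hooks = []  # list of (priority, callable)

        def add_hook(self, priority, fn):
            self._hooks.append((priority, fn))

        def run_all(self):
            # Higher priority runs first, unlike raw JVM shutdown hooks,
            # whose relative ordering is undefined.
            for _, fn in sorted(self._hooks, key=lambda h: -h[0]):
                fn()

    manager = ShutdownHookManager()
    order = []
    manager.add_hook(10, lambda: order.append("flush caches"))
    manager.add_hook(50, lambda: order.append("stop executors"))
    manager.run_all()
    print(order)  # ['stop executors', 'flush caches']
    ```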

commit 154567dca126d4992c9c9b08d71d22e9af43c995
Author: Rohit Agarwal <ro...@qubole.com>
Date:   2015-12-17T03:04:33Z

    [SPARK-12186][WEB UI] Send the complete request URI including the query string when redirecting.
    
    Author: Rohit Agarwal <ro...@qubole.com>
    
    Closes #10180 from mindprince/SPARK-12186.
    
    (cherry picked from commit fdb38227564c1af40cbfb97df420b23eb04c002b)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit 4ad08035d28b8f103132da9779340c5e64e2d1c2
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2015-12-17T03:47:49Z

    [SPARK-12386][CORE] Fix NPE when spark.executor.port is set.
    
    Author: Marcelo Vanzin <va...@cloudera.com>
    
    Closes #10339 from vanzin/SPARK-12386.
    
    (cherry picked from commit d1508dd9b765489913bc948575a69ebab82f217b)
    Signed-off-by: Andrew Or <an...@databricks.com>

commit d509194b81abc3c7bf9563d26560d596e1415627
Author: Yin Huai <yh...@databricks.com>
Date:   2015-12-17T07:18:53Z

    [SPARK-12057][SQL] Prevent failure on corrupt JSON records
    
    This PR makes JSON parser and schema inference handle more cases where we have unparsed records. It is based on #10043. The last commit fixes the failed test and updates the logic of schema inference.
    
    Regarding the schema inference change, if we have something like
    ```
    {"f1":1}
    [1,2,3]
    ```
    originally, we will get a DF without any column.
    After this change, we will get a DF with columns `f1` and `_corrupt_record`. Basically, for the second row, `[1,2,3]` will be the value of `_corrupt_record`.
    
    When merging this PR, please make sure that the author is simplyianm.
    
    JIRA: https://issues.apache.org/jira/browse/SPARK-12057
    
    Closes #10043
    
    Author: Ian Macalinao <me...@ian.pw>
    Author: Yin Huai <yh...@databricks.com>
    
    Closes #10288 from yhuai/handleCorruptJson.
    
    (cherry picked from commit 9d66c4216ad830812848c657bbcd8cd50949e199)
    Signed-off-by: Reynold Xin <rx...@databricks.com>
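
    The behavior described above, where unparseable rows land in a `_corrupt_record` column instead of being dropped, can be sketched in plain Python; this illustrates the idea only and is not Spark's parser:

    ```python
    import json

    def parse_json_lines(lines):
        """Parse each line as a JSON object; rows that fail are kept
        under a _corrupt_record column rather than discarded."""
        rows = []
        for line in lines:
            try:
                record = json.loads(line)
                if not isinstance(record, dict):
                    raise ValueError("not a JSON object")
                rows.append(record)
            except ValueError:  # JSONDecodeError subclasses ValueError
                rows.append({"_corrupt_record": line})
        return rows

    data = ['{"f1":1}', '[1,2,3]']
    print(parse_json_lines(data))
    # [{'f1': 1}, {'_corrupt_record': '[1,2,3]'}]
    ```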

commit da7542f2408140a9a3b7ea245350976ac18676a5
Author: echo2mei <53...@qq.com>
Date:   2015-12-17T15:59:17Z

    Once driver register successfully, stop it to connect to master.
    
    This commit is to resolve SPARK-12396.
    
    Author: echo2mei <53...@qq.com>
    
    Closes #10354 from echoTomei/master.
    
    (cherry picked from commit 5a514b61bbfb609c505d8d65f2483068a56f1f70)
    Signed-off-by: Davies Liu <da...@gmail.com>

commit a8466489ab01e59fe07ba20adfc3983ec6928157
Author: Davies Liu <da...@gmail.com>
Date:   2015-12-17T16:01:59Z

    Revert "Once driver register successfully, stop it to connect to master."
    
    This reverts commit da7542f2408140a9a3b7ea245350976ac18676a5.

commit 1ebedb20f2c5b781eafa9bf2b5ab092d744cc4fd
Author: Davies Liu <da...@databricks.com>
Date:   2015-12-17T16:04:11Z

    [SPARK-12395] [SQL] fix resulting columns of outer join
    
    For API DataFrame.join(right, usingColumns, joinType), if the joinType is right_outer or full_outer, the resulting join columns could be wrong (will be null).
    
    The order of columns had been changed to match that with MySQL and PostgreSQL [1].
    
    This PR also fixes the nullability of output for outer join.
    
    [1] http://www.postgresql.org/docs/9.2/static/queries-table-expressions.html
    
    Author: Davies Liu <da...@databricks.com>
    
    Closes #10353 from davies/fix_join.
    
    (cherry picked from commit a170d34a1b309fecc76d1370063e0c4f44dc2142)
    Signed-off-by: Davies Liu <da...@gmail.com>
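
    The fix concerns `join(right, usingColumns, joinType)`: in a full outer join, the shared join column must be taken from whichever side has the row, or one-sided rows end up with a null key. A small pure-Python sketch of the corrected behavior (illustrative only):

    ```python
    def full_outer_join(left, right, key):
        """Join two lists of dicts on `key`, coalescing the key column
        so it is non-null whenever either side has the row."""
        out = []
        for l in left:
            match = next((r for r in right if r[key] == l[key]), None)
            # Left values win on conflict; the key survives from either side.
            out.append({**(match or {}), **l})
        for r in right:
            if all(l[key] != r[key] for l in left):
                out.append(dict(r))  # right-only row keeps its own key
        return out

    left = [{"id": 1, "a": "x"}]
    right = [{"id": 2, "b": "y"}]
    print(full_outer_join(left, right, "id"))
    # [{'id': 1, 'a': 'x'}, {'id': 2, 'b': 'y'}]
    ```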

commit 41ad8aced2fc6c694c15e9465cfa34517b2395e8
Author: Yanbo Liang <yb...@gmail.com>
Date:   2015-12-17T17:19:46Z

    [SQL] Update SQLContext.read.text doc
    
    Since we renamed the column from ```text``` to ```value``` for DataFrames loaded by ```SQLContext.read.text```, we need to update the doc.
    
    Author: Yanbo Liang <yb...@gmail.com>
    
    Closes #10349 from yanboliang/text-value.
    
    (cherry picked from commit 6e0771665b3c9330fc0a5b2c7740a796b4cd712e)
    Signed-off-by: Reynold Xin <rx...@databricks.com>

commit 1fbca41200d6e73cb276d5949b894881c700323f
Author: Shixiong Zhu <sh...@databricks.com>
Date:   2015-12-17T17:55:37Z

    [SPARK-12220][CORE] Make Utils.fetchFile support files that contain special characters
    
    This PR encodes and decodes the file name to fix the issue.
    
    Author: Shixiong Zhu <sh...@databricks.com>
    
    Closes #10208 from zsxwing/uri.
    
    (cherry picked from commit 86e405f357711ae93935853a912bc13985c259db)
    Signed-off-by: Shixiong Zhu <sh...@databricks.com>
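
    Encoding and decoding a file name so special characters survive a URI round trip can be illustrated with Python's standard library; the real fix lives in Spark's Scala `Utils.fetchFile`, so this is only the analogous idea:

    ```python
    from urllib.parse import quote, unquote

    name = "my file+v2 (final).jar"
    encoded = quote(name)       # percent-encode characters unsafe in a URI
    decoded = unquote(encoded)  # recover the original file name

    print(encoded)  # my%20file%2Bv2%20%28final%29.jar
    assert decoded == name
    ```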

commit 881f2544e13679c185a7c34ddb82e885aaa79813
Author: Iulian Dragos <ja...@gmail.com>
Date:   2015-12-17T18:19:31Z

    [SPARK-12345][MESOS] Properly filter out SPARK_HOME in the Mesos REST server
    
    Fix problem with #10332, this one should fix Cluster mode on Mesos
    
    Author: Iulian Dragos <ja...@gmail.com>
    
    Closes #10359 from dragos/issue/fix-spark-12345-one-more-time.
    
    (cherry picked from commit 8184568810e8a2e7d5371db2c6a0366ef4841f70)
    Signed-off-by: Kousuke Saruta <sa...@oss.nttdata.co.jp>

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: update from orign

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13118#issuecomment-219223888
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: update from orign

Posted by zhaorongsheng <gi...@git.apache.org>.
Github user zhaorongsheng commented on the pull request:

    https://github.com/apache/spark/pull/13118#issuecomment-219288567
  
    Sorry, it was an unintentional mistake.
    I'll close it right now!




[GitHub] spark pull request: update from orign

Posted by zhaorongsheng <gi...@git.apache.org>.
Github user zhaorongsheng closed the pull request at:

    https://github.com/apache/spark/pull/13118




[GitHub] spark pull request: update from orign

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the pull request:

    https://github.com/apache/spark/pull/13118#issuecomment-219223860
  
    @zhaorongsheng it seems this was opened by mistake. I guess it should be closed.

