dev@spark.apache.org, 2016-02

- sbt publish-local fails with 2.0.0-SNAPSHOT - posted by Mike Hynes <91...@gmail.com> on 2016/02/01 08:01:14 UTC, 2 replies.
- Spark job does not perform well when some RDD in memory and some on Disk - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/01 09:32:44 UTC, 2 replies.
- Guidelines for writing SPARK packages - posted by Praveen Devarao <pr...@in.ibm.com> on 2016/02/01 12:56:47 UTC, 0 replies.
- Spark Executor retries infinitely - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/01 13:16:26 UTC, 0 replies.
- Re: Scala 2.11 default build - posted by Steve Loughran <st...@hortonworks.com> on 2016/02/01 13:22:43 UTC, 8 replies.
- [ANNOUNCE] New SAMBA Package = Spark + AWS Lambda - posted by David Russell <th...@gmail.com> on 2016/02/01 13:23:43 UTC, 1 replies.
- Re: Spark 1.6.1 - posted by Ted Yu <yu...@gmail.com> on 2016/02/01 16:29:09 UTC, 24 replies.
- Encrypting jobs submitted by the client - posted by eugene miretsky <eu...@gmail.com> on 2016/02/01 21:48:22 UTC, 3 replies.
- Secure multi tenancy on in stand alone mode - posted by eugene miretsky <eu...@gmail.com> on 2016/02/01 22:02:02 UTC, 1 replies.
- Spark saveAsHadoopFile stage fails with ExecutorLostfailure - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/02 15:21:56 UTC, 0 replies.
- Spark 1.6.0 Streaming + Persistence Bug? - posted by mkhaitman <ma...@chango.com> on 2016/02/02 16:46:05 UTC, 1 replies.
- Launch dev/run-tests on Windows - posted by Wen Pei Yu <yu...@cn.ibm.com> on 2016/02/03 08:40:53 UTC, 0 replies.
- RE: spark hivethriftserver problem on 1.5.0 -> 1.6.0 upgrade - posted by "james.green9@baesystems.com" <ja...@baesystems.com> on 2016/02/03 13:17:35 UTC, 0 replies.
- SparkOscope: Enabling Spark Optimization through Cross-stack Monitoring and Visualization - posted by Yiannis Gkoufas <jo...@gmail.com> on 2016/02/03 16:26:56 UTC, 2 replies.
- Path to resource added with SQL: ADD FILE - posted by Antonio Piccolboni <an...@piccolboni.info> on 2016/02/03 19:17:05 UTC, 2 replies.
- Spark 1.6: Why Including hive-jdbc in assembly when -Phive-provided is set? - posted by Andrew Lee <al...@hotmail.com> on 2016/02/03 19:57:50 UTC, 0 replies.
- TakeOrderedAndProject operator may cause an OOM - posted by 汪洋 <ti...@icloud.com> on 2016/02/04 08:59:00 UTC, 1 replies.
- Interested in Contributing to Spark as GSoC 2016 - posted by Tao Lin <nb...@gmail.com> on 2016/02/04 11:46:43 UTC, 1 replies.
- Re: Using CUDA within Spark / boosting linear algebra - posted by Max Grossman <jm...@rice.edu> on 2016/02/04 16:13:05 UTC, 2 replies.
- Building Spark with Custom Hadoop Version - posted by Charles Wright <ch...@live.ca> on 2016/02/05 00:03:45 UTC, 3 replies.
- [GSoC] Interested in GSoC 2016 ideas - posted by Kai Jiang <ji...@gmail.com> on 2016/02/05 07:53:57 UTC, 0 replies.
- Spark process failing to receive data from the Kafka queue in yarn-client mode. - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2016/02/05 18:38:18 UTC, 0 replies.
- Preserving partitioning with dataframe select - posted by Matt Cheah <mc...@palantir.com> on 2016/02/05 22:49:14 UTC, 3 replies.
- Fwd: Writing to jdbc database from SparkR (1.5.2) - posted by Andrew Holway <an...@otternetworks.de> on 2016/02/06 20:22:30 UTC, 6 replies.
- pyspark worker concurrency - posted by Renyi Xiong <re...@gmail.com> on 2016/02/07 03:27:38 UTC, 1 replies.
- Scala API: simplifying common patterns - posted by sim <si...@swoop.com> on 2016/02/08 00:29:09 UTC, 8 replies.
- Welcoming two new committers - posted by Matei Zaharia <ma...@gmail.com> on 2016/02/08 18:15:34 UTC, 18 replies.
- [build system] brief downtime, 8am PST thursday feb 10th - posted by shane knapp <sk...@berkeley.edu> on 2016/02/08 18:27:40 UTC, 3 replies.
- Spark in Production - Use Cases - posted by Scott walent <sc...@gmail.com> on 2016/02/08 23:12:34 UTC, 0 replies.
- spark on yarn wastes one box (or 1 GB on each box) for am container - posted by Alexander Pivovarov <ap...@gmail.com> on 2016/02/09 06:03:03 UTC, 21 replies.
- Re: Long running Spark job on YARN throws "No AMRMToken" - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/09 06:55:57 UTC, 5 replies.
- Error aliasing an array column. - posted by rakeshchalasani <vn...@gmail.com> on 2016/02/09 20:55:23 UTC, 7 replies.
- map-side-combine in Spark SQL - posted by Rishitesh Mishra <ri...@gmail.com> on 2016/02/10 05:37:31 UTC, 1 replies.
- Re: Kmeans++ using 1 core only Was: Slowness in Kmeans calculating fastSquaredDistance - posted by Li Ming Tsai <ma...@ltsai.com> on 2016/02/10 06:59:45 UTC, 0 replies.
- Re: Spark Job on YARN accessing Hbase Table - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/10 09:17:16 UTC, 1 replies.
- Introducing spark-sklearn, a scikit-learn integration package for Spark - posted by Tim Hunter <ti...@databricks.com> on 2016/02/10 18:13:59 UTC, 0 replies.
- SPARK_WORKER_MEMORY in Spark Standalone - conf.getenv vs System.getenv? - posted by Jacek Laskowski <ja...@japila.pl> on 2016/02/11 14:51:06 UTC, 6 replies.
- [build system] additional jenkins downtime next thursday - posted by shane knapp <sk...@berkeley.edu> on 2016/02/11 19:19:46 UTC, 4 replies.
- Re: Making BatchPythonEvaluation actually Batch - posted by Davies Liu <da...@databricks.com> on 2016/02/11 20:26:44 UTC, 0 replies.
- Operations on DataFrames with User Defined Types in pyspark - posted by Franklyn D'souza <fr...@shopify.com> on 2016/02/11 22:42:03 UTC, 0 replies.
- Spark Summit San Francisco 2016 call for presentations (CFP) - posted by Reynold Xin <rx...@apache.org> on 2016/02/11 23:52:39 UTC, 0 replies.
- Building Spark with a Custom Version of Hadoop: HDFS ClassNotFoundException - posted by Charlie Wright <ch...@live.ca> on 2016/02/12 02:15:45 UTC, 1 replies.
- Spark SQL performance: version 1.6 vs version 1.5 - posted by Le Tien Dung <ti...@gmail.com> on 2016/02/12 16:23:57 UTC, 2 replies.
- Saving a Pipeline with DecisionTreeModel Spark ML - posted by gstvolvr <g....@gmail.com> on 2016/02/12 20:34:01 UTC, 1 replies.
- Re: Spark 1.6.0 + Hive + HBase - posted by chutium <te...@gmail.com> on 2016/02/15 15:45:39 UTC, 0 replies.
- Dataset in spark 2.0.0-SNAPSHOT missing columns - posted by Koert Kuipers <ko...@tresata.com> on 2016/02/15 16:12:29 UTC, 2 replies.
- Subscribe - posted by Jayesh Thakrar <j_...@yahoo.com.INVALID> on 2016/02/15 17:01:50 UTC, 0 replies.
- Call wholeTextFiles to read gzip files - posted by Deepak Gopalakrishnan <dg...@gmail.com> on 2016/02/16 11:17:00 UTC, 1 replies.
- DataFrame API and Ordering - posted by Maciej Szymkiewicz <ms...@gmail.com> on 2016/02/17 05:28:54 UTC, 3 replies.
- FYI: github is getting DDOSed - posted by shane knapp <sk...@berkeley.edu> on 2016/02/17 23:19:48 UTC, 0 replies.
- pull request template - posted by Reynold Xin <rx...@databricks.com> on 2016/02/18 04:18:46 UTC, 3 replies.
- SPARK-9559 - posted by Ashish Soni <as...@gmail.com> on 2016/02/18 16:13:32 UTC, 1 replies.
- Kafka connector mention in Matei's keynote - posted by Cody Koeninger <co...@koeninger.org> on 2016/02/18 17:59:33 UTC, 1 replies.
- How to run PySpark tests? - posted by Jason White <ja...@shopify.com> on 2016/02/19 03:07:41 UTC, 4 replies.
- Ability to auto-detect input data for datasources (by file extension). - posted by Hyukjin Kwon <gu...@gmail.com> on 2016/02/19 03:25:55 UTC, 1 replies.
- Concurrency does not improve for Spark Jobs with Same Spark Context - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/19 06:51:35 UTC, 2 replies.
- Re: Write access to wiki - posted by Holden Karau <ho...@pigscanfly.ca> on 2016/02/19 16:42:43 UTC, 0 replies.
- Using Encoding to reduce GraphX's static graph memory consumption - posted by ahaider3 <ah...@hawk.iit.edu> on 2016/02/21 05:05:21 UTC, 4 replies.
- How do we run that PR auto-close script again? - posted by Sean Owen <so...@cloudera.com> on 2016/02/22 10:17:11 UTC, 2 replies.
- a new FileFormat 5x~100x faster than parquet - posted by 开心延年 <mu...@qq.com> on 2016/02/22 12:14:09 UTC, 0 replies.
- Re: a new FileFormat 5x~100x faster than parquet - posted by 平平 <xu...@qq.com> on 2016/02/22 12:46:20 UTC, 1 replies.
- Re: a new FileFormat 5x~100x faster than parquet - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2016/02/22 13:12:46 UTC, 0 replies.
- Re: a new FileFormat 5x~100x faster than parquet - posted by 开心延年 <mu...@qq.com> on 2016/02/22 13:27:34 UTC, 1 replies.
- Builds are failing - posted by Iulian Dragoș <iu...@typesafe.com> on 2016/02/22 13:27:41 UTC, 0 replies.
- Re: Re: a new FileFormat 5x~100x faster than parquet - posted by 开心延年 <mu...@qq.com> on 2016/02/22 15:03:36 UTC, 1 replies.
- [build system] jenkins restarted - posted by shane knapp <sk...@berkeley.edu> on 2016/02/22 22:42:49 UTC, 2 replies.
- Re: Spark not able to fetch events from Amazon Kinesis - posted by Yash Sharma <ya...@gmail.com> on 2016/02/23 02:58:41 UTC, 0 replies.
- Opening a JIRA for QuantileDiscretizer bug - posted by "Pierson, Oliver C" <oc...@gatech.edu> on 2016/02/23 03:45:07 UTC, 2 replies.
- spark core api vs. google cloud dataflow - posted by lonely Feb <lo...@gmail.com> on 2016/02/23 09:16:45 UTC, 1 replies.
- Re: Accessing Web UI - posted by Vasanth Bhat <va...@gmail.com> on 2016/02/23 10:15:48 UTC, 1 replies.
- ORC file writing hangs in pyspark - posted by James Barney <ja...@gmail.com> on 2016/02/23 15:05:31 UTC, 3 replies.
- Modify text in spark-packages - posted by Sergio Ramírez <sr...@ugr.es> on 2016/02/23 18:33:34 UTC, 0 replies.
- Fwd: HANA data access from SPARK - posted by Dushyant Rajput <du...@gmail.com> on 2016/02/23 22:46:41 UTC, 0 replies.
- Spark Job on YARN Hogging the entire Cluster resource - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/24 02:19:53 UTC, 5 replies.
- Build fails - posted by Minudika Malshan <mi...@gmail.com> on 2016/02/24 20:28:19 UTC, 4 replies.
- Spark Summit (San Francisco, June 6-8) call for presentation due in less than week - posted by Reynold Xin <rx...@apache.org> on 2016/02/24 22:50:18 UTC, 0 replies.
- how about a custom coalesce() policy? - posted by Nezih Yigitbasi <ny...@netflix.com.INVALID> on 2016/02/24 23:08:30 UTC, 2 replies.
- Spark HANA jdbc connection issue - posted by Dushyant Rajput <du...@gmail.com> on 2016/02/25 00:34:39 UTC, 0 replies.
- Bug in DiskBlockManager subDirs logic? - posted by Zee Chen <ze...@gmail.com> on 2016/02/25 09:24:25 UTC, 0 replies.
- Eclipse: Wrong project dependencies in generated by "sbt eclipse" - posted by lgieron <lg...@gmail.com> on 2016/02/25 14:55:49 UTC, 6 replies.
- [discuss] DataFrame vs Dataset in Spark 2.0 - posted by Reynold Xin <rx...@databricks.com> on 2016/02/26 00:23:33 UTC, 10 replies.
- Re: DirectFileOutputCommiter - posted by Teng Qiu <te...@gmail.com> on 2016/02/26 11:43:20 UTC, 0 replies.
- Is spark.driver.maxResultSize used correctly ? - posted by Jeff Zhang <zj...@gmail.com> on 2016/02/26 11:44:30 UTC, 2 replies.
- Fwd: Aggregation + Adding static column + Union + Projection = Problem - posted by Jiří Syrový <sy...@gmail.com> on 2016/02/26 15:11:19 UTC, 1 replies.
- make-distribution.sh fails because tachyon-project was renamed to Alluxio - posted by Jong Wook Kim <il...@gmail.com> on 2016/02/26 16:47:13 UTC, 3 replies.
- More Robust DataSource Parameters - posted by Hamel Kothari <ha...@gmail.com> on 2016/02/26 17:44:33 UTC, 2 replies.
- Re: Hbase in spark - posted by Ted Yu <yu...@gmail.com> on 2016/02/26 17:55:40 UTC, 1 replies.
- Upgrading to Kafka 0.9.x - posted by Mark Grover <ma...@apache.org> on 2016/02/26 18:46:17 UTC, 2 replies.
- External dependencies in public APIs (was previously: Upgrading to Kafka 0.9.x) - posted by Reynold Xin <rx...@databricks.com> on 2016/02/26 20:49:07 UTC, 0 replies.
- some joins stopped working with spark 2.0.0 SNAPSHOT - posted by Koert Kuipers <ko...@tresata.com> on 2016/02/27 06:54:58 UTC, 3 replies.
- Spark Checkpointing behavior - posted by Tarek Elgamal <ta...@gmail.com> on 2016/02/27 09:58:48 UTC, 0 replies.
- beeline and spark-defaults.conf - posted by longsonr <lo...@gmail.com> on 2016/02/27 17:27:45 UTC, 0 replies.
- Spark log4j fully qualified class name - posted by Prabhu Joseph <pr...@gmail.com> on 2016/02/27 21:40:27 UTC, 3 replies.
- spark yarn exec container fails if yarn.nodemanager.local-dirs value starts with file:// - posted by Alexander Pivovarov <ap...@gmail.com> on 2016/02/27 22:31:36 UTC, 0 replies.
- Implementing Bagging ensemble method using spark.mlLib - posted by Minudika Malshan <mi...@gmail.com> on 2016/02/28 19:32:05 UTC, 0 replies.
- Control the stdout and stderr streams in a executor JVM - posted by Niranda Perera <ni...@gmail.com> on 2016/02/29 06:50:27 UTC, 2 replies.
- Mapper side join with DataFrames API - posted by Deepak Gopalakrishnan <dg...@gmail.com> on 2016/02/29 16:45:28 UTC, 1 replies.
- What should be spark.local.dir in spark on yarn? - posted by Alexander Pivovarov <ap...@gmail.com> on 2016/02/29 20:12:50 UTC, 0 replies.