- Re: [VOTE] Release Apache Spark 1.2.1 (RC2) - posted by Matei Zaharia <ma...@gmail.com> on 2015/02/01 04:36:25 UTC, 2 replies.
- Intellij IDEA 14 env setup; NoClassDefFoundError when run examples - posted by Yafeng Guo <da...@gmail.com> on 2015/02/01 05:01:06 UTC, 3 replies.
- Re: renaming SchemaRDD -> DataFrame - posted by Evan Chan <ve...@gmail.com> on 2015/02/01 09:31:17 UTC, 6 replies.
- Caching tables at column level - posted by Mick Davies <mi...@gmail.com> on 2015/02/01 12:03:31 UTC, 2 replies.
- Re: Any interest in 'weighting' VectorTransformer which does component-wise scaling? - posted by Octavian Geagla <og...@gmail.com> on 2015/02/01 12:53:00 UTC, 0 replies.
- Re: Custom Cluster Managers / Standalone Recovery Mode in Spark - posted by Ewan Higgs <ew...@ugent.be> on 2015/02/01 16:31:34 UTC, 2 replies.
- Word2Vec IndexedRDD - posted by Michael Malak <mi...@yahoo.com.INVALID> on 2015/02/02 03:04:09 UTC, 0 replies.
- Questions about Spark standalone resource scheduler - posted by "Shao, Saisai" <sa...@intel.com> on 2015/02/02 09:24:18 UTC, 2 replies.
- Re: Get size of rdd in memory - posted by ankits <an...@gmail.com> on 2015/02/02 21:23:45 UTC, 3 replies.
- Spark Master Maven with YARN build is broken - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/02 22:44:14 UTC, 1 replies.
- Additional fix for Avro IncompatibleClassChangeError (SPARK-3039) - posted by "M. Dale" <me...@yahoo.com.INVALID> on 2015/02/02 23:18:08 UTC, 0 replies.
- Performance test for sort shuffle - posted by Kannan Rajah <kr...@maprtech.com> on 2015/02/02 23:26:43 UTC, 1 replies.
- [spark-sql] JsonRDD - posted by Daniil Osipov <da...@shazam.com> on 2015/02/03 01:16:47 UTC, 3 replies.
- Building Spark with Pants - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/03 01:25:47 UTC, 5 replies.
- Temporary jenkins issue - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/03 04:37:50 UTC, 1 replies.
- [RESULT] [VOTE] Release Apache Spark 1.2.1 (RC2) - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/03 05:51:32 UTC, 0 replies.
- IDF for ml pipeline - posted by masaki rikitoku <ri...@gmail.com> on 2015/02/03 05:56:27 UTC, 2 replies.
- [VOTE] Release Apache Spark 1.2.1 (RC3) - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/03 05:57:28 UTC, 8 replies.
- Can spark provide an option to start reduce stage early? - posted by Xuelin Cao <xu...@gmail.com> on 2015/02/03 06:48:53 UTC, 1 replies.
- Jenkins install reference - posted by scwf <wa...@huawei.com> on 2015/02/03 08:53:05 UTC, 2 replies.
- Accessing indices and values in SparseVector - posted by Manoj Kumar <ma...@gmail.com> on 2015/02/03 13:17:22 UTC, 2 replies.
- SparkSubmit.scala and stderr - posted by jayhutfles <ja...@gmail.com> on 2015/02/03 15:28:47 UTC, 4 replies.
- [ANNOUNCE] branch-1.3 has been cut - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/03 20:59:09 UTC, 0 replies.
- Welcoming three new committers - posted by Matei Zaharia <ma...@gmail.com> on 2015/02/03 23:34:10 UTC, 24 replies.
- Re: 2GB limit for partitions? - posted by Reynold Xin <rx...@databricks.com> on 2015/02/04 00:20:43 UTC, 4 replies.
- ASF Git / GitHub sync is down - posted by Reynold Xin <rx...@databricks.com> on 2015/02/04 06:09:08 UTC, 1 replies.
- Hive window functions in 1.2+ - posted by Al Thompson <at...@gmail.com> on 2015/02/04 14:52:46 UTC, 0 replies.
- 1.2.1-rc3 - Avro input format for Hadoop 2 broken/fix? - posted by "M. Dale" <me...@yahoo.com.INVALID> on 2015/02/04 16:30:44 UTC, 2 replies.
- Spark Cluster vs Spark on YARN jar loading - posted by Sergey Belousov <se...@gmail.com> on 2015/02/04 22:08:30 UTC, 0 replies.
- ZMQ and python streaming - posted by Sasha Kacanski <sk...@gmail.com> on 2015/02/04 22:13:21 UTC, 0 replies.
- multi-line comment style - posted by Kay Ousterhout <ka...@gmail.com> on 2015/02/04 22:53:22 UTC, 10 replies.
- Spark Summit CFP - Tracks guidelines - posted by Evan Chan <ve...@gmail.com> on 2015/02/05 02:46:26 UTC, 1 replies.
- Broken record a bit here: building spark on intellij with sbt - posted by Stephen Boesch <ja...@gmail.com> on 2015/02/05 04:25:13 UTC, 4 replies.
- When will Spark Streaming supports Kafka-simple consumer API? - posted by Xuelin Cao <xu...@gmail.com> on 2015/02/05 04:44:10 UTC, 2 replies.
- Using CUDA within Spark / boosting linear algebra - posted by "Ulanov, Alexander" <al...@hp.com> on 2015/02/05 20:55:21 UTC, 28 replies.
- PSA: Maven supports parallel builds - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/06 01:16:12 UTC, 2 replies.
- spark 1.3 sbt build seems to be broken - posted by shane knapp <sk...@berkeley.edu> on 2015/02/06 02:01:46 UTC, 1 replies.
- Data source API | sizeInBytes should be to *Scan - posted by Aniket Bhatnagar <an...@gmail.com> on 2015/02/06 12:39:27 UTC, 4 replies.
- Improving metadata in Spark JIRA - posted by Sean Owen <so...@cloudera.com> on 2015/02/06 15:45:20 UTC, 11 replies.
- Unit tests - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/06 21:55:11 UTC, 3 replies.
- [RESULT] [VOTE] Release Apache Spark 1.2.1 (RC3) - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/07 02:14:39 UTC, 0 replies.
- Spark SQL Window Functions - posted by "Evan R. Sparks" <ev...@gmail.com> on 2015/02/07 02:29:09 UTC, 1 replies.
- Pull Requests on github - posted by fommil <sa...@gmail.com> on 2015/02/07 02:37:05 UTC, 3 replies.
- Keep or remove Debian packaging in Spark? - posted by Sean Owen <so...@cloudera.com> on 2015/02/09 11:41:54 UTC, 8 replies.
- Re: run time exceptions in Spark 1.2.0 manual build together with OpenStack hadoop driver - posted by Gil Vernik <GI...@il.ibm.com> on 2015/02/09 12:22:31 UTC, 1 replies.
- pyspark.daemon issues? - posted by mkhaitman <ma...@chango.com> on 2015/02/09 22:26:48 UTC, 0 replies.
- [ANNOUNCE] Apache Spark 1.2.1 Released - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/09 22:31:05 UTC, 0 replies.
- Re: spark-ec2 licensing clarification - posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu> on 2015/02/10 01:00:13 UTC, 0 replies.
- adding some temporary jenkins worker nodes... - posted by shane knapp <sk...@berkeley.edu> on 2015/02/10 02:18:14 UTC, 0 replies.
- Mail to user@spark.apache.org failing - posted by Meethu Mathew <me...@flytxt.com> on 2015/02/10 06:00:58 UTC, 1 replies.
- New Metrics Sink class not packaged in spark-assembly jar - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/02/10 07:02:37 UTC, 0 replies.
- Powered by Spark: Concur - posted by Denny Lee <de...@gmail.com> on 2015/02/10 07:11:41 UTC, 5 replies.
- Re: New Metrics Sink class not packaged in spark-assembly jar - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/10 07:42:51 UTC, 2 replies.
- Spark On HPC Podcast - posted by Brock Palen <br...@umich.edu> on 2015/02/10 14:39:48 UTC, 0 replies.
- Batch prediction for ALS - posted by Debasish Das <de...@gmail.com> on 2015/02/10 17:01:13 UTC, 4 replies.
- FYI: Prof John Canny is giving a talk on "Machine Learning at the limit" in SF Big Analytics Meetup - posted by Chester Chen <ch...@alpinenow.com> on 2015/02/10 19:11:16 UTC, 0 replies.
- new committer criteria - posted by Imran Rashid <im...@therashids.com> on 2015/02/10 19:44:02 UTC, 0 replies.
- Spark Summit East - March 18-19 - NYC - posted by Scott walent <sc...@gmail.com> on 2015/02/11 00:08:15 UTC, 0 replies.
- Build spark failed with maven - posted by Yi Tian <ti...@gmail.com> on 2015/02/11 05:08:32 UTC, 1 replies.
- CallbackServer in PySpark Streaming - posted by Todd Gao <to...@gmail.com> on 2015/02/11 09:32:49 UTC, 4 replies.
- [GraphX] Estimating Average distance of a big graph using GraphX - posted by James <al...@gmail.com> on 2015/02/11 12:30:19 UTC, 0 replies.
- [ml] Lost persistence for fold in crossvalidation. - posted by Peter Rudenko <pe...@gmail.com> on 2015/02/11 20:13:03 UTC, 2 replies.
- numpy on PyPy - potential benefit to PySpark - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/11 21:07:48 UTC, 0 replies.
- 1.2.1 start-all.sh broken? - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/11 23:27:29 UTC, 9 replies.
- [ANNOUNCE] Spark 1.3.0 Snapshot 1 - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/12 04:32:28 UTC, 0 replies.
- Re: Re: Sort Shuffle performance issues about using AppendOnlyMap for large data sets - posted by "fightfate@163.com" <fi...@163.com> on 2015/02/12 06:37:16 UTC, 2 replies.
- driver fail-over in Spark streaming 1.2.0 - posted by lin <ku...@gmail.com> on 2015/02/12 08:24:31 UTC, 1 replies.
- How to track issues that must wait for Spark 2.x in JIRA? - posted by Sean Owen <so...@cloudera.com> on 2015/02/12 09:42:12 UTC, 3 replies.
- Why a program would receive null from send message of mapReduceTriplets - posted by James <al...@gmail.com> on 2015/02/12 15:26:50 UTC, 5 replies.
- Spark SQL value proposition in batch pipelines - posted by vha14 <vh...@msn.com> on 2015/02/12 17:56:12 UTC, 3 replies.
- Re: Optimize encoding/decoding strings when using Parquet - posted by Mick Davies <mi...@gmail.com> on 2015/02/13 10:40:24 UTC, 0 replies.
- mllib.recommendation Design - posted by Debasish Das <de...@gmail.com> on 2015/02/13 16:46:41 UTC, 2 replies.
- FW: Trouble posting to the list - posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/02/13 21:14:50 UTC, 0 replies.
- Spark & Hive - posted by The Watcher <wa...@gmail.com> on 2015/02/15 12:03:41 UTC, 1 replies.
- Re: A Spark Compilation Question - posted by vha14 <vh...@msn.com> on 2015/02/15 21:15:58 UTC, 0 replies.
- Replacing Jetty with TomCat - posted by Niranda Perera <ni...@gmail.com> on 2015/02/16 06:08:50 UTC, 11 replies.
- HiveContext cannot be serialized - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/02/16 13:27:36 UTC, 5 replies.
- Spark Receivers - posted by Mark Payne <ma...@g.uky.edu> on 2015/02/16 17:22:08 UTC, 0 replies.
- org.apache.spark.sql.sources.DDLException: Unsupported dataType: [1.1] failure: ``varchar'' expected but identifier char found in spark-sql - posted by Qiuzhuang Lian <qi...@gmail.com> on 2015/02/17 07:39:52 UTC, 1 replies.
- Fwd: [MLlib] Performance problem in GeneralizedLinearAlgorithm - posted by Josh Devins <jo...@soundcloud.com> on 2015/02/17 15:36:17 UTC, 3 replies.
- JavaRDD Aggregate initial value - Closure-serialized zero value reasoning? - posted by Matt Cheah <mc...@palantir.com> on 2015/02/18 05:31:45 UTC, 5 replies.
- [VOTE] Release Apache Spark 1.3.0 (RC1) - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/18 09:12:08 UTC, 25 replies.
- Merging code into branch 1.3 - posted by Patrick Wendell <pw...@gmail.com> on 2015/02/18 09:21:26 UTC, 0 replies.
- Streaming partitions to driver for use in .toLocalIterator - posted by Andrew Ash <an...@andrewash.com> on 2015/02/18 16:09:39 UTC, 3 replies.
- Issue SPARK-5008 (persistent-hdfs broken) - posted by Joe Wass <jw...@crossref.org> on 2015/02/18 16:39:45 UTC, 0 replies.
- quick jenkins restart tomorrow morning, ~7am PST - posted by shane knapp <sk...@berkeley.edu> on 2015/02/18 21:55:27 UTC, 2 replies.
- [Performance] Possible regression in rdd.take()? - posted by Matt Cheah <mc...@palantir.com> on 2015/02/19 01:47:11 UTC, 4 replies.
- Spark-SQL 1.2.0 "sort by" results are not consistent with Hive - posted by Kannan Rajah <kr...@maprtech.com> on 2015/02/19 03:33:21 UTC, 0 replies.
- If job fails shuffle space is not cleaned - posted by Debasish Das <de...@gmail.com> on 2015/02/19 08:20:55 UTC, 0 replies.
- RE: spark slave cannot execute without admin permission on windows - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/02/19 09:26:11 UTC, 1 replies.
- Hive SKEWED feature supported in Spark SQL ? - posted by The Watcher <wa...@gmail.com> on 2015/02/19 11:25:38 UTC, 1 replies.
- Have Friedman's glmnet algo running in Spark - posted by mi...@mbowles.com on 2015/02/19 19:59:08 UTC, 6 replies.
- Spark SQL, Hive & Parquet data types - posted by The Watcher <wa...@gmail.com> on 2015/02/19 23:50:27 UTC, 6 replies.
- The default CDH4 build uses avro-mapred hadoop1 - posted by Mingyu Kim <mk...@palantir.com> on 2015/02/20 09:30:23 UTC, 2 replies.
- OSGI bundles for spark project.. - posted by Niranda Perera <ni...@gmail.com> on 2015/02/20 09:33:10 UTC, 2 replies.
- Spark performance on 32 Cpus Server Cluster - posted by Dirceu Semighini Filho <di...@gmail.com> on 2015/02/20 11:01:25 UTC, 3 replies.
- Spark 1.3 RC1 Generate schema based on string of schema - posted by Denny Lee <de...@gmail.com> on 2015/02/20 16:51:32 UTC, 1 replies.
- GSOC2015 - posted by magellane a <ma...@gmail.com> on 2015/02/21 17:13:12 UTC, 1 replies.
- Spark SQL - Long running job - posted by nitin <ni...@gmail.com> on 2015/02/21 17:55:49 UTC, 3 replies.
- Google Summer of Code - ideas - posted by Manoj Kumar <ma...@gmail.com> on 2015/02/21 20:24:30 UTC, 6 replies.
- Git Achievements - posted by Nicholas Chammas <ni...@gmail.com> on 2015/02/22 09:33:50 UTC, 0 replies.
- textFile() ordering and header rows - posted by Michael Malak <mi...@yahoo.com.INVALID> on 2015/02/23 03:04:48 UTC, 1 replies.
- StreamingContext textFileStream question - posted by mkhaitman <ma...@chango.com> on 2015/02/23 19:53:31 UTC, 2 replies.
- [jenkins infra -- pls read ] installing anaconda, moving default python from 2.6 -> 2.7 - posted by shane knapp <sk...@berkeley.edu> on 2015/02/23 20:13:02 UTC, 3 replies.
- Does Spark delete shuffle files of lost executor in running system(on YARN)? - posted by nitin <ni...@gmail.com> on 2015/02/24 16:46:54 UTC, 0 replies.
- PySpark SPARK_CLASSPATH doesn't distribute jars to executors - posted by Michael Nazario <mn...@palantir.com> on 2015/02/24 23:54:23 UTC, 2 replies.
- [ERROR] bin/compute-classpath.sh: fails with false positive test for java 1.7 vs 1.6 - posted by Mike Hynes <91...@gmail.com> on 2015/02/25 00:02:12 UTC, 3 replies.
- Help vote for Spark talks at the Hadoop Summit - posted by Reynold Xin <rx...@databricks.com> on 2015/02/25 06:54:39 UTC, 1 replies.
- UnusedStubClass in 1.3.0-rc1 - posted by Cody Koeninger <co...@koeninger.org> on 2015/02/25 16:53:05 UTC, 3 replies.
- Some praise and comments on Spark - posted by Devl Devel <de...@gmail.com> on 2015/02/25 23:13:09 UTC, 2 replies.
- Need advice for Spark newbie - posted by Vikram Kone <vi...@gmail.com> on 2015/02/26 08:56:31 UTC, 5 replies.
- number of partitions for hive schemaRDD - posted by masaki rikitoku <ri...@gmail.com> on 2015/02/26 10:31:44 UTC, 1 replies.
- graph.mapVertices() function obtain edge triplets with null attribute - posted by James <al...@gmail.com> on 2015/02/26 13:38:26 UTC, 0 replies.
- Re: Scheduler hang? - posted by Victor Tso-Guillen <vt...@paxata.com> on 2015/02/26 22:37:11 UTC, 3 replies.
- Monitoring Spark with Graphite and Grafana - posted by Ryan Williams <ry...@gmail.com> on 2015/02/27 03:10:58 UTC, 1 replies.
- trouble with sbt building network-* projects? - posted by Imran Rashid <ir...@cloudera.com> on 2015/02/27 20:10:00 UTC, 3 replies.
- How to create a Row from a List or Array in Spark using Scala - posted by "r7raul1984@163.com" <r7...@163.com> on 2015/02/28 08:58:43 UTC, 1 replies.