- Spark incorrectly collects data from Hadoop SequenceFile - posted by Sergey Parhomenko <sp...@gmail.com> on 2013/10/01 00:06:23 UTC, 1 replies.
- Question about SparkContext.stop() - posted by Xiang Huo <hu...@gmail.com> on 2013/10/01 00:45:39 UTC, 0 replies.
- Spark Streaming architecture question - shared memory model - posted by Domingo Mihovilovic <do...@exabeam.com> on 2013/10/01 00:52:51 UTC, 2 replies.
- Help required to deploy code to Standalone Cluster - posted by purav aggarwal <pu...@gmail.com> on 2013/10/01 10:46:20 UTC, 0 replies.
- SizeEstimator.scala - posted by Jakub Liska <li...@gmail.com> on 2013/10/01 18:42:17 UTC, 2 replies.
- Spark performance on smallerish data sets: EC2 Mediums - posted by Gary Malouf <ma...@gmail.com> on 2013/10/01 19:54:39 UTC, 3 replies.
- Spark and mesos cluster utilization - posted by Allen Charles - callen <Ch...@acxiom.com> on 2013/10/01 20:43:39 UTC, 0 replies.
- Join on DStream and RDD - posted by Nicholas Pritchard <ni...@falkonry.com> on 2013/10/02 00:37:52 UTC, 1 replies.
- Building Spark 0.8 - posted by Stuart Layton <st...@bitsighttech.com> on 2013/10/02 05:34:57 UTC, 3 replies.
- spark-0.8.0 Twitter examples showing errors - posted by prabeesh k <pr...@gmail.com> on 2013/10/02 07:36:59 UTC, 3 replies.
- spark-0.8.0 -- Beginner mistake when trying to run NetworkWordCount example - posted by prabeesh k <pr...@gmail.com> on 2013/10/02 07:59:22 UTC, 0 replies.
- Accessing broadcast variables by name - posted by Elmer Garduno <ga...@gmail.com> on 2013/10/02 14:52:41 UTC, 4 replies.
- Roadblock with Spark 0.8.0 ActorStream - posted by Paul Snively <ps...@icloud.com> on 2013/10/02 19:37:10 UTC, 10 replies.
- Re: Some questions about task distribution and execution in Spark - posted by Matei Zaharia <ma...@gmail.com> on 2013/10/02 22:00:02 UTC, 7 replies.
- java.lang.AbstractMethodError - posted by Eduardo Berrocal <eb...@hawk.iit.edu> on 2013/10/03 20:31:09 UTC, 6 replies.
- Troubleshooting and how to interpret the logs - posted by Ashish Rangole <ar...@gmail.com> on 2013/10/03 22:10:55 UTC, 2 replies.
- Sort order of RDD rows - posted by Mingyu Kim <mk...@palantir.com> on 2013/10/04 00:33:55 UTC, 2 replies.
- How to prevent webUI from coming up - posted by Jesvin Jose <fr...@gmail.com> on 2013/10/04 08:18:16 UTC, 1 replies.
- Naive Bayes Classifier with mllib - posted by Aslan Bekirov <as...@gmail.com> on 2013/10/04 10:47:47 UTC, 1 replies.
- Loss was due to com.esotericsoftware.kryo.KryoException: Buffer overflow. - posted by Ryan Compton <co...@gmail.com> on 2013/10/05 02:31:30 UTC, 6 replies.
- spark-ec2 launch script ... some issues and comments - posted by Shay Seng <sh...@1618labs.com> on 2013/10/05 03:00:55 UTC, 1 replies.
- spark through vpn, SPARK_LOCAL_IP - posted by Aaron Babcock <aa...@gmail.com> on 2013/10/05 23:45:31 UTC, 3 replies.
- Meet at CIKM - posted by Sebastian Schelter <ss...@apache.org> on 2013/10/06 13:17:22 UTC, 1 replies.
- Output to a single directory with multiple files rather multiple directories ? - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/07 05:51:14 UTC, 11 replies.
- HCatalog and spark - posted by Chester <ch...@yahoo.com> on 2013/10/07 07:32:26 UTC, 0 replies.
- Deploying spark on EC2 failed - posted by Hao REN <ju...@gmail.com> on 2013/10/07 16:40:20 UTC, 2 replies.
- Testing spark streaming jobs - posted by Ryan Weald <ry...@weald.com> on 2013/10/07 23:58:11 UTC, 0 replies.
- Spark dependency library causing problems with conflicting versions at import - posted by Mingyu Kim <mk...@palantir.com> on 2013/10/08 03:18:21 UTC, 3 replies.
- The functionality of daemon.py? - posted by Shangyu Luo <ls...@gmail.com> on 2013/10/08 06:50:26 UTC, 3 replies.
- Spark in a heterogeneous computing environment - posted by Markus Losoi <ma...@gmail.com> on 2013/10/08 08:32:15 UTC, 1 replies.
- Configuring log4j for Spark 0.8.0 examples - posted by Markus Losoi <ma...@gmail.com> on 2013/10/08 09:35:48 UTC, 0 replies.
- spark_ec2 script in 0.8.0 and mesos - posted by Shay Seng <sh...@1618labs.com> on 2013/10/08 19:15:23 UTC, 2 replies.
- How would I start writing a RDD[ProtoBuf] and/or sc.newAPIHadoopFile?? - posted by Shay Seng <sh...@1618labs.com> on 2013/10/09 03:16:39 UTC, 1 replies.
- GPU-awareness - posted by Patrick Grinaway <pg...@gmail.com> on 2013/10/09 21:01:41 UTC, 1 replies.
- spark 0.8.0 null pointer exception when accessing mondodb twice - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2013/10/09 23:15:57 UTC, 0 replies.
- 0.8 in Maven Repo? - posted by Erik Freed <er...@codecision.com> on 2013/10/10 02:25:26 UTC, 1 replies.
- Running pi example error with spark 0.8.0 cdh4 version - posted by Shangyu Luo <ls...@gmail.com> on 2013/10/10 08:44:08 UTC, 2 replies.
- Not Able to setup spark standalone Cluster(Newbie) - posted by vinayak navale <vi...@gmail.com> on 2013/10/10 11:14:18 UTC, 3 replies.
- Controlling the name of the output file - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/10 13:18:03 UTC, 1 replies.
- Output configuration - posted by Alex Levin <st...@gmail.com> on 2013/10/10 13:29:20 UTC, 2 replies.
- Re: Haven't got a response for any of my mails, reaching the group ? - posted by Henry Saputra <he...@gmail.com> on 2013/10/10 19:32:02 UTC, 0 replies.
- Execution time of spark job - posted by prabeesh k <pr...@gmail.com> on 2013/10/11 06:04:22 UTC, 2 replies.
- Standalone Cluster check your cluster UI to ensure that workers are registered error Only For Shark - posted by vinayak navale <vi...@gmail.com> on 2013/10/11 09:14:13 UTC, 2 replies.
- Write to HBase from spark job - posted by Eugen Cepoi <ce...@gmail.com> on 2013/10/11 17:53:21 UTC, 2 replies.
- Kafka dependency issues - posted by Ryan Weald <ry...@weald.com> on 2013/10/12 00:14:14 UTC, 4 replies.
- ReduceByKey OOME workaround: 'sample and subtract', but rdd.subtract does not subtract. - posted by Stanley Burnitt <St...@huawei.com> on 2013/10/12 01:51:34 UTC, 0 replies.
- Spark REPL produces error on a piece of scala code that works in pure Scala REPL - posted by Shay Seng <sh...@1618labs.com> on 2013/10/12 01:55:20 UTC, 5 replies.
- Spark 0.8.0 on Mesos 0.13.0 (clustered) : NoClassDefFoundError - posted by Bart Vercammen <ba...@portico.io> on 2013/10/12 09:52:00 UTC, 2 replies.
- Large input file problem - posted by Grega Kešpret <gr...@celtra.com> on 2013/10/13 01:51:29 UTC, 4 replies.
- Drawback of Spark memory model as compared to Hadoop? - posted by howard chen <ho...@gmail.com> on 2013/10/13 05:24:11 UTC, 1 replies.
- Persist to disk causes memory problems - posted by Kyle Ellrott <ke...@soe.ucsc.edu> on 2013/10/13 08:48:44 UTC, 3 replies.
- Spark & Spark Streaming, how to get started for local development? - posted by Ryan Chan <ry...@gmail.com> on 2013/10/14 06:17:20 UTC, 1 replies.
- Suggested Filesystem Layout for Spark Cluster Node - posted by Craig Vanderborgh <cr...@gmail.com> on 2013/10/14 20:28:31 UTC, 5 replies.
- Job hangs for larger input size - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/14 21:30:08 UTC, 1 replies.
- A program that works in local mode but fails in distributed mode - posted by Markus Losoi <ma...@gmail.com> on 2013/10/15 10:27:08 UTC, 5 replies.
- Another instance of Derby may have already booted the database /opt/spark/shark/bin/metastore_db. - posted by vinayak navale <vi...@gmail.com> on 2013/10/15 12:05:59 UTC, 1 replies.
- pyspark memory usage - posted by eshishki <it...@gmail.com> on 2013/10/15 16:29:56 UTC, 3 replies.
- Logging from Spark Jobs on a Mesos Cluster - posted by Craig Vanderborgh <cr...@gmail.com> on 2013/10/15 22:03:51 UTC, 1 replies.
- Questions about list file in FileInputDStream - posted by Erix Yao <ya...@gmail.com> on 2013/10/16 10:59:31 UTC, 0 replies.
- Re: Announcing the first Spark Summit, Mon Dec 2, 2013 - posted by Matei Zaharia <ma...@gmail.com> on 2013/10/17 01:28:45 UTC, 1 replies.
- Spark and Juju - posted by Maarten Ectors <ma...@canonical.com> on 2013/10/17 18:29:26 UTC, 2 replies.
- executor memory in standalone mode stays at default 512MB - posted by Ameet Kini <am...@gmail.com> on 2013/10/17 20:58:45 UTC, 1 replies.
- job reports as KILLED in standalone mode - posted by Ameet Kini <am...@gmail.com> on 2013/10/17 23:38:38 UTC, 4 replies.
- spark 0.8 - posted by Koert Kuipers <ko...@tresata.com> on 2013/10/18 00:05:39 UTC, 21 replies.
- building spark - posted by Umar Javed <um...@gmail.com> on 2013/10/18 00:07:19 UTC, 3 replies.
- help on SparkContext.sequenceFile() - posted by Shay Seng <sh...@1618labs.com> on 2013/10/18 02:35:35 UTC, 3 replies.
- Spark - Loading in data from CSVs and Postgres - posted by Victor Hooi <vi...@yahoo.com> on 2013/10/18 07:24:08 UTC, 4 replies.
- PySpark sequence file support - posted by Peter Aberline <pe...@gmail.com> on 2013/10/18 11:10:49 UTC, 3 replies.
- examples of map-side join of two hadoop sequence files - posted by Ameet Kini <am...@gmail.com> on 2013/10/18 23:20:02 UTC, 5 replies.
- Fwd: snappy - posted by Koert Kuipers <ko...@tresata.com> on 2013/10/19 01:30:44 UTC, 2 replies.
- Shark 0.8.0 release - posted by Reynold Xin <rx...@apache.org> on 2013/10/19 04:07:38 UTC, 0 replies.
- Log Streaming Analysis - posted by prabeesh k <pr...@gmail.com> on 2013/10/19 06:18:01 UTC, 0 replies.
- help me find the lineage graph code in spark - posted by dachuan <hd...@gmail.com> on 2013/10/20 02:53:48 UTC, 2 replies.
- Kryo Serializer - posted by Wenlei Xie <we...@gmail.com> on 2013/10/20 07:24:42 UTC, 1 replies.
- Support for gz files ? - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/21 08:58:23 UTC, 2 replies.
- Spark unit test question - posted by Shay Seng <sh...@1618labs.com> on 2013/10/21 19:30:46 UTC, 3 replies.
- RDD sample fraction precision - posted by Matt Cheah <mc...@palantir.com> on 2013/10/21 21:01:17 UTC, 2 replies.
- Help with Initial Cluster Configuration / Tuning - posted by Timothy Perrigo <tp...@gmail.com> on 2013/10/21 22:05:38 UTC, 11 replies.
- Spark map function question - posted by Wisc Forum <wi...@gmail.com> on 2013/10/22 06:18:49 UTC, 1 replies.
- Connecting a worker to a master that was bound to the loopback address - posted by Markus Losoi <ma...@gmail.com> on 2013/10/22 08:17:07 UTC, 0 replies.
- Broken link in quickstart - posted by Sebastian Schelter <ss...@apache.org> on 2013/10/22 09:29:02 UTC, 3 replies.
- Spark Streaming - How to control the parallelism like storm - posted by Ryan Chan <ry...@gmail.com> on 2013/10/22 16:24:47 UTC, 4 replies.
- unable to serialize analytics pipeline - posted by Philip Ogren <ph...@oracle.com> on 2013/10/22 19:50:10 UTC, 2 replies.
- Visitor function to RDD elements - posted by Matt Cheah <mc...@palantir.com> on 2013/10/22 21:28:43 UTC, 14 replies.
- Stage failures - posted by Tom Vacek <mi...@gmail.com> on 2013/10/22 22:17:17 UTC, 4 replies.
- Join, order - posted by Pavel Ajtkulov <aj...@gmail.com> on 2013/10/23 10:14:04 UTC, 0 replies.
- JavaPairRDD unpersist - posted by Yann Luppo <Ya...@LiveNation.com> on 2013/10/23 23:01:55 UTC, 2 replies.
- getting Caused by: org.apache.spark.SparkException: Job failed: Task 1.0:1 failed more than 4 times - posted by Hu...@Dell.com on 2013/10/23 23:02:25 UTC, 2 replies.
- setting SPARK_JAVA_OPTS spark-env.sh not working as expected - posted by Hu...@Dell.com on 2013/10/23 23:15:10 UTC, 1 replies.
- solution to write data to S3? - posted by Nan Zhu <zh...@gmail.com> on 2013/10/24 03:17:18 UTC, 6 replies.
- dynamically resizing Spark cluster - posted by Nan Zhu <zh...@gmail.com> on 2013/10/24 05:31:32 UTC, 0 replies.
- Loading leveldb into Spark.. - posted by "Himanshu Bafna (HB)" <hb...@yahoo.com> on 2013/10/24 07:36:40 UTC, 2 replies.
- Failed to build Spark with YARN 2.2.0 - posted by Pei-Lun Lee <pl...@appier.com> on 2013/10/24 08:39:02 UTC, 2 replies.
- Take last k elements from RDD? - posted by Matt Cheah <mc...@palantir.com> on 2013/10/25 00:23:55 UTC, 3 replies.
- Fwd: Spark Build using Scala 2.10 on Windows - posted by Yogesh Shetty <yo...@gmail.com> on 2013/10/25 02:11:07 UTC, 2 replies.
- Swapping of Spark Streaming job - posted by Ryan Chan <ry...@gmail.com> on 2013/10/25 07:12:02 UTC, 2 replies.
- almost sorted data - posted by Arun Kumar <ar...@gmail.com> on 2013/10/25 11:01:38 UTC, 8 replies.
- help with sbt - posted by Umar Javed <um...@gmail.com> on 2013/10/25 11:29:11 UTC, 7 replies.
- gc/oome from 14,000 DiskBlockObjectWriters - posted by Stephen Haberman <st...@gmail.com> on 2013/10/25 16:41:38 UTC, 2 replies.
- Compare with Storm - posted by howard chen <ho...@gmail.com> on 2013/10/25 17:32:59 UTC, 1 replies.
- understanding spark internals - posted by Umar Javed <um...@gmail.com> on 2013/10/25 21:27:44 UTC, 5 replies.
- Spark integration with HDFS and Cassandra simultaneously - posted by Gary Malouf <ma...@gmail.com> on 2013/10/26 02:15:15 UTC, 7 replies.
- accessing spark ui over an ssh tunnel - posted by Stephen Haberman <st...@gmail.com> on 2013/10/26 04:40:15 UTC, 3 replies.
- oome from blockmanager - posted by Stephen Haberman <st...@gmail.com> on 2013/10/26 20:43:53 UTC, 11 replies.
- set up spark in eclipse - posted by Arun Kumar <ar...@gmail.com> on 2013/10/27 06:20:11 UTC, 5 replies.
- Dependency while creating jar duplicate file. - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/27 09:04:40 UTC, 5 replies.
- Reading custom inputformat from hadoop dfs - posted by Arun Kumar <ar...@gmail.com> on 2013/10/28 09:52:25 UTC, 1 replies.
- exception when reading HDFS data from local model - posted by Jianmin Wu <ji...@optaim.com> on 2013/10/28 16:32:49 UTC, 0 replies.
- Job duration - posted by Lucas Fernandes Brunialti <lb...@igcorp.com.br> on 2013/10/28 17:11:09 UTC, 8 replies.
- Task output before a shuffle - posted by Ufuk Celebi <u....@fu-berlin.de> on 2013/10/28 17:25:33 UTC, 2 replies.
- compare/contrast Spark with Cascading - posted by Philip Ogren <ph...@oracle.com> on 2013/10/28 18:11:12 UTC, 15 replies.
- Modeling and implementation - posted by Amit Mor <am...@gmail.com> on 2013/10/28 19:15:01 UTC, 0 replies.
- Data processing conventions with Spark. - posted by Sriram Ramachandrasekaran <sr...@gmail.com> on 2013/10/28 19:17:27 UTC, 1 replies.
- Questions about the files that Spark will produce during its running - posted by Shangyu Luo <ls...@gmail.com> on 2013/10/29 02:52:20 UTC, 2 replies.
- help with Spark serialize problem (StreamingCorruptedException) - posted by "Shao, Saisai" <sa...@intel.com> on 2013/10/29 07:30:38 UTC, 0 replies.
- Reading corrupted hadoop sequence files - posted by Arun Kumar <ar...@gmail.com> on 2013/10/29 12:03:37 UTC, 0 replies.
- met a problem while running a streaming example program - posted by dachuan <hd...@gmail.com> on 2013/10/29 14:54:50 UTC, 4 replies.
- Re: spark-0.8.0 and hadoop-2.1.0-beta - posted by Matei Zaharia <ma...@gmail.com> on 2013/10/29 19:34:35 UTC, 2 replies.
- Spark cluster memory configuration for spark-shell - posted by Soumya Simanta <so...@gmail.com> on 2013/10/29 20:40:28 UTC, 1 replies.
- executor failures w/ scala 2.10 - posted by Imran Rashid <im...@quantifind.com> on 2013/10/29 20:59:15 UTC, 16 replies.
- Getting exception org.apache.spark.SparkException: Job aborted: Task 1.0:37 failed more than 4 times - posted by Soumya Simanta <so...@gmail.com> on 2013/10/29 21:17:10 UTC, 1 replies.
- Spark v0.7.3: spark.SparkException Bug - posted by Craig Vanderborgh <cr...@gmail.com> on 2013/10/29 21:41:51 UTC, 0 replies.
- Spark Checkpointing Bug - posted by Craig Vanderborgh <cr...@gmail.com> on 2013/10/29 21:51:35 UTC, 1 replies.
- Program hangs when using local[2] - posted by Ramkumar Chokkalingam <ra...@gmail.com> on 2013/10/30 09:20:46 UTC, 2 replies.
- How to exclude a library from "sbt assembly" - posted by Mingyu Kim <mk...@palantir.com> on 2013/10/30 17:58:01 UTC, 2 replies.
- API for a Web editor for Spark - posted by Romain Rigaux <ro...@gmail.com> on 2013/10/30 20:25:50 UTC, 0 replies.
- Multiple users of the Spark REPL - posted by Ryan Prenger <ry...@tracevector.com> on 2013/10/31 01:02:14 UTC, 0 replies.
- Save RDDs as CSV - posted by Shay Seng <sh...@1618labs.com> on 2013/10/31 02:34:31 UTC, 9 replies.
- Failed to Register spark.kryo.registrator and EOFException - posted by Robin Cjc <cj...@gmail.com> on 2013/10/31 08:30:53 UTC, 0 replies.
- repartitioning RDDS - posted by Daniel Mahler <dm...@gmail.com> on 2013/10/31 10:54:47 UTC, 3 replies.
- Worker lost during processing large input - posted by Bo Lu <bl...@etinternational.com> on 2013/10/31 18:21:00 UTC, 0 replies.