- Distributed streaming quantiles with PySpark - posted by Uri Laserson <la...@cloudera.com> on 2014/02/01 01:33:23 UTC, 1 replies.
- RE: Error: Could not find or load main class org.apache.spark.executor.CoarseGrainedExecutorBackend - posted by Hu...@Dell.com on 2014/02/01 02:43:42 UTC, 0 replies.
- Re: Python API Performance - posted by Jeremy Freeman <fr...@gmail.com> on 2014/02/01 08:42:21 UTC, 6 replies.
- default parallelism in trunk - posted by Koert Kuipers <ko...@tresata.com> on 2014/02/01 09:30:55 UTC, 3 replies.
- java BindException pyspark - posted by Izhar ul Hassan <ez...@gmail.com> on 2014/02/01 16:04:54 UTC, 0 replies.
- question on Spark worker node core settings. - posted by Wisc Forum <wi...@gmail.com> on 2014/02/01 16:48:30 UTC, 0 replies.
- Spark support for Kerberos secured HDFS cluster - posted by Manoj Samel <ma...@gmail.com> on 2014/02/01 19:15:50 UTC, 1 replies.
- (Unknown) - posted by Paul Kullich <pa...@tumra.com> on 2014/02/01 20:09:49 UTC, 1 replies.
- object file not loading correctly - posted by zhen <z....@latrobe.edu.au> on 2014/02/01 22:46:23 UTC, 2 replies.
- Hadoop MapReduce on Spark - posted by nileshc <ni...@nileshc.com> on 2014/02/02 00:57:08 UTC, 5 replies.
- EOFException when deserializing (simple) task - posted by Sandy Ryza <sa...@cloudera.com> on 2014/02/02 04:15:05 UTC, 1 replies.
- Re: Spark app gets slower as it gets executed more times - posted by 尹绪森 <yi...@gmail.com> on 2014/02/02 05:27:23 UTC, 4 replies.
- PageView streaming sample lost page views - posted by OlegYch <ol...@gmail.com> on 2014/02/02 16:09:42 UTC, 7 replies.
- Persisting RDD to Redis - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/02/02 16:18:42 UTC, 4 replies.
- Can't get a local job to parallelise (using 0.9.0 from git with parquet and avro) - posted by Hassan Syed <h....@gmail.com> on 2014/02/02 16:22:48 UTC, 8 replies.
- How to improve performance in searching for URLs. - posted by suman bharadwaj <su...@gmail.com> on 2014/02/02 20:09:56 UTC, 1 replies.
- Re: Hash Join in Spark - posted by rose kunj <ro...@yahoo.com> on 2014/02/03 05:47:21 UTC, 2 replies.
- spark streaming questions - posted by Liam Stewart <li...@gmail.com> on 2014/02/03 20:03:20 UTC, 1 replies.
- writing SparkR reducer functions - posted by Justin Lent <ju...@gmail.com> on 2014/02/04 01:38:46 UTC, 1 replies.
- ClassNotFoundException: PRCombiner - posted by Tsai Li Ming <ma...@ltsai.com> on 2014/02/04 03:08:48 UTC, 1 replies.
- spark errors: Executor X disconnected, so removing it - posted by emeric <em...@jp.fujitsu.com> on 2014/02/04 05:17:46 UTC, 0 replies.
- spark 0.9.0 on top of Mesos error Akka Actor not found - posted by Francesco Bongiovanni <bo...@gmail.com> on 2014/02/04 11:20:51 UTC, 12 replies.
- Looking for resources on Map\Reduce concepts - posted by goi cto <go...@gmail.com> on 2014/02/04 12:15:08 UTC, 4 replies.
- Broadcast variable exception - too many open files - posted by Sourav Chandra <so...@livestream.com> on 2014/02/04 13:04:06 UTC, 6 replies.
- Spark and disk cache. - posted by Mskh <ma...@yahoo.com> on 2014/02/04 13:44:19 UTC, 1 replies.
- Re: Spark + MongoDB - posted by Sampo Niskanen <sa...@wellmo.com> on 2014/02/04 14:58:32 UTC, 3 replies.
- Announcing Calliope releases 0.8.1-GA and 0.9.0-EA - posted by Rohit Rai <ro...@tuplejump.com> on 2014/02/04 15:01:20 UTC, 7 replies.
- Job hangs on spark-0.8.1 - posted by Roshan Nair <ro...@indix.com> on 2014/02/04 16:43:26 UTC, 3 replies.
- Spark Streaming StreamingContext error - posted by soojin <xa...@yahoo.com> on 2014/02/04 18:22:09 UTC, 1 replies.
- IRC channel - posted by Akhil Das <ak...@mobipulse.in> on 2014/02/04 18:22:47 UTC, 4 replies.
- Reading Tweets (JSON) in a file into RDD Spark - posted by Soumya Simanta <so...@gmail.com> on 2014/02/04 18:31:49 UTC, 3 replies.
- RDD of binary files - posted by David Thomas <dt...@gmail.com> on 2014/02/04 21:54:50 UTC, 1 replies.
- Adding external jar to spark-shell classpath using ADD_JARS - posted by Soumya Simanta <so...@gmail.com> on 2014/02/04 22:28:21 UTC, 3 replies.
- Re: Problem with running Spark over Mesos in fine-grained mode - posted by Alberto Miorin <sp...@ululi.it> on 2014/02/04 22:53:10 UTC, 0 replies.
- Local scan of an RDD (each element with successor) - posted by Adam Novak <an...@soe.ucsc.edu> on 2014/02/05 01:25:52 UTC, 2 replies.
- sbt dependencies for running Standalone app on Spark v0.9.0-incubating-SNAPSHOT - posted by Soumya Simanta <so...@gmail.com> on 2014/02/05 03:04:51 UTC, 4 replies.
- Message processing rate of spark - posted by Sourav Chandra <so...@livestream.com> on 2014/02/05 07:32:41 UTC, 4 replies.
- Scheduler delay - posted by Sourav Chandra <so...@livestream.com> on 2014/02/05 09:33:39 UTC, 0 replies.
- Re: Streaming files as a whole - posted by Tathagata Das <ta...@gmail.com> on 2014/02/05 10:05:25 UTC, 0 replies.
- Re: SparkStreaming not read hadoop configuration from its sparkContext on Stand Alone mode? - posted by Tathagata Das <ta...@gmail.com> on 2014/02/05 10:11:05 UTC, 0 replies.
- executors in single local mode - posted by aecc <al...@gmail.com> on 2014/02/05 12:15:23 UTC, 0 replies.
- Spark 0.9.0 support for windows - posted by goi cto <go...@gmail.com> on 2014/02/05 15:33:10 UTC, 5 replies.
- Could not find resource path for Web UI: org/apache/spark/ui/static - posted by zgalic <zd...@fer.hr> on 2014/02/05 15:37:27 UTC, 2 replies.
- Problem connecting to Spark Cluster from a standalone Scala program - posted by Soumya Simanta <so...@gmail.com> on 2014/02/05 16:19:42 UTC, 5 replies.
- Re: What I am missing from configuration? - posted by Dana Tontea <dt...@cylex.ro> on 2014/02/05 19:40:37 UTC, 3 replies.
- Using Parquet from an interactive Spark shell - posted by Uri Laserson <la...@cloudera.com> on 2014/02/05 21:02:55 UTC, 6 replies.
- Have anyone tried to run Spark 0.9 built with Hadoop 2.2 on Mesos 0.15 - posted by elyast <lu...@gmail.com> on 2014/02/05 21:40:08 UTC, 0 replies.
- Re: MLLib Sparse Input - posted by Imran Rashid <im...@quantifind.com> on 2014/02/05 23:11:04 UTC, 3 replies.
- Is HDFS the only possible data source for spark with python? - posted by cwhiten <ch...@gmail.com> on 2014/02/05 23:25:21 UTC, 1 replies.
- Clean up app metadata on worker nodes - posted by Mingyu Kim <mk...@palantir.com> on 2014/02/06 00:46:44 UTC, 1 replies.
- serialization exceptions in spark-shell with 0.9 - posted by Stephen Haberman <st...@gmail.com> on 2014/02/06 03:24:45 UTC, 1 replies.
- Re: [parquet-dev] Re: Using Parquet from an interactive Spark shell - posted by Uri Laserson <la...@cloudera.com> on 2014/02/06 04:44:24 UTC, 8 replies.
- data locality in logs - posted by Tsai Li Ming <ma...@ltsai.com> on 2014/02/06 07:16:37 UTC, 1 replies.
- [0.9.0] MEMORY_AND_DISK_SER not falling back to disk - posted by Andrew Ash <an...@andrewash.com> on 2014/02/06 07:29:35 UTC, 4 replies.
- Database connection per worker - posted by aecc <al...@gmail.com> on 2014/02/06 13:57:44 UTC, 2 replies.
- Errors occurred while compiling module 'spark-streaming-zeromq' (IntelliJ IDEA 13.0.2) - posted by zgalic <zd...@fer.hr> on 2014/02/06 15:04:18 UTC, 6 replies.
- spark 0.9.0 compatible with hadoop 1.0.4 ? - posted by Suhas Satish <su...@gmail.com> on 2014/02/07 03:20:04 UTC, 1 replies.
- Compiling NetworkWordCount scala code - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/02/07 04:30:08 UTC, 1 replies.
- Akka Connection refused - standalone cluster using spark-0.9.0 - posted by Pillis W <pi...@gmail.com> on 2014/02/07 08:09:24 UTC, 8 replies.
- saving partitions separately - posted by Vipul Pandey <vi...@gmail.com> on 2014/02/07 08:58:53 UTC, 0 replies.
- trouble with broadcast variables on pyspark - posted by Sandy Ryza <sa...@cloudera.com> on 2014/02/07 09:16:52 UTC, 2 replies.
- Equivalent to Hadoop's standard counters - posted by Daniel Siegmann <da...@velos.io> on 2014/02/07 20:55:58 UTC, 2 replies.
- In Memory Caching blowing up the size - posted by Vipul Pandey <vi...@gmail.com> on 2014/02/07 21:38:10 UTC, 4 replies.
- pyspark + gevent - posted by Timothee Besset <tt...@ttimo.net> on 2014/02/07 23:14:21 UTC, 0 replies.
- Best way to implement this in Spark - posted by Soumya Simanta <so...@gmail.com> on 2014/02/08 01:28:59 UTC, 6 replies.
- unsubscribe - posted by Mohit Singh <mo...@gmail.com> on 2014/02/08 02:16:25 UTC, 3 replies.
- [ANN] Calliope release 0.9.0-C2-EA to work with Cassandra 2.0.x - posted by Rohit Rai <ro...@tuplejump.com> on 2014/02/08 13:04:38 UTC, 0 replies.
- working on a closed network - any recomendations - posted by goi cto <go...@gmail.com> on 2014/02/09 07:44:31 UTC, 2 replies.
- Shuffle file not found Exception - posted by Guillaume Pitel <gu...@exensa.com> on 2014/02/09 17:28:10 UTC, 2 replies.
- Application evaluation on Spark - posted by "haikal.pribadi" <ha...@gmail.com> on 2014/02/09 22:27:08 UTC, 1 replies.
- Connecting to an inmemory database from Spark - posted by Nithya <ni...@hp.com> on 2014/02/10 06:22:58 UTC, 1 replies.
- EOF Exception when trying to access hdfs:// - posted by mohankreddy <mr...@beanatomics.com> on 2014/02/10 09:44:31 UTC, 5 replies.
- Re: Is it possible to build with Maven? - posted by Sean Owen <so...@cloudera.com> on 2014/02/10 10:02:51 UTC, 1 replies.
- TestFunction is not a member of spark.api.java.function - posted by praveshjain1991 <pr...@gmail.com> on 2014/02/10 11:01:54 UTC, 0 replies.
- Fwd: Some Questions & Doubts regarding Spark process - posted by vinay Bajaj <vb...@gmail.com> on 2014/02/10 13:29:48 UTC, 3 replies.
- Spark Java Heap Size issue - posted by Jaggu <ja...@gmail.com> on 2014/02/10 15:02:43 UTC, 1 replies.
- problem running multiple executors on large machine - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2014/02/10 15:48:08 UTC, 0 replies.
- cannot find awaitTermination() - posted by Kal El <pi...@yahoo.com> on 2014/02/10 15:50:48 UTC, 1 replies.
- "Failed to find Spark examples assembly..." error message - posted by David Swearingen <ds...@centrifugesystems.com> on 2014/02/10 17:49:35 UTC, 3 replies.
- how is fault tolerance achieved in spark - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/10 19:44:10 UTC, 5 replies.
- Unable to run spark in local mode - posted by kamatsuoka <ke...@gmail.com> on 2014/02/10 20:14:44 UTC, 0 replies.
- graphx missing from spark-shell - posted by Eric Kimbrel <er...@soteradefense.com> on 2014/02/11 01:10:01 UTC, 2 replies.
- Connecting App to cluster VS Launching app within cluster - posted by robin_up <ro...@gmail.com> on 2014/02/11 03:10:02 UTC, 2 replies.
- Does fair scheduling preempt running tasks? - posted by Mingyu Kim <mk...@palantir.com> on 2014/02/11 09:07:40 UTC, 0 replies.
- java.lang.ClassNotFoundException: org.apache.spark.streaming.StreamingContext - posted by Kal El <pi...@yahoo.com> on 2014/02/11 10:18:31 UTC, 1 replies.
- object scala not found - posted by Sai Prasanna <an...@gmail.com> on 2014/02/11 11:00:13 UTC, 2 replies.
- running Spark Streaming just once and stop it - posted by Kal El <pi...@yahoo.com> on 2014/02/11 12:59:32 UTC, 0 replies.
- Spark stand alone application java.lang.IncompatibleClassChangeError - posted by Jaggu <ja...@gmail.com> on 2014/02/11 14:48:04 UTC, 4 replies.
- spark streaming job - set java memory - posted by Kal El <pi...@yahoo.com> on 2014/02/11 15:21:31 UTC, 1 replies.
- more complex analytics - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/11 16:07:12 UTC, 2 replies.
- Query execution in spark - posted by Ravi Hemnani <ra...@gmail.com> on 2014/02/11 17:31:54 UTC, 4 replies.
- does any reply to questions here? - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/11 18:18:16 UTC, 1 replies.
- Task not serializable (java.io.NotSerializableException) - posted by David Thomas <dt...@gmail.com> on 2014/02/11 18:18:28 UTC, 4 replies.
- Measure throughput streaming - posted by aecc <al...@gmail.com> on 2014/02/11 18:52:46 UTC, 1 replies.
- How/where to set the hostname used by the spark workers? - posted by Gerard Maas <ge...@gmail.com> on 2014/02/11 19:14:42 UTC, 2 replies.
- How to log Shuffle read/write size - posted by Chen Jin <ka...@gmail.com> on 2014/02/11 20:14:46 UTC, 1 replies.
- Yarn configuration file doesn't work when run with yarn-client mode - posted by Nan Zhu <zh...@gmail.com> on 2014/02/12 07:28:25 UTC, 0 replies.
- GC issues - posted by "Livni, Dana" <da...@intel.com> on 2014/02/12 08:23:15 UTC, 2 replies.
- utf8 encoding error in serializers on pyspark 0.9.0 - posted by Julaiti Alafate <ar...@gmail.com> on 2014/02/12 09:24:29 UTC, 1 replies.
- Best practice for retrieving big data from RDD to local machine - posted by Egor Pahomov <pa...@gmail.com> on 2014/02/12 10:07:04 UTC, 1 replies.
- Slow network 1T data - posted by Andre Kuhnen <an...@gmail.com> on 2014/02/12 12:04:42 UTC, 0 replies.
- spark streaming 2.9.3 vs spark core 2.10 - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/12 16:18:02 UTC, 2 replies.
- cassandra integration with spark - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/12 18:01:25 UTC, 3 replies.
- Re: build shark with hive 0.11/0.12 - posted by "deenar.toraskar" <de...@db.com> on 2014/02/12 18:41:52 UTC, 0 replies.
- spark master OOME from maxMbInFlight buffers - posted by Stephen Haberman <st...@gmail.com> on 2014/02/12 20:02:50 UTC, 2 replies.
- saveAsNewAPIHadoopFile and Relative Paths on Mesos - posted by Adam Novak <an...@soe.ucsc.edu> on 2014/02/12 21:34:10 UTC, 0 replies.
- spark 0.9.0 saveAsTextFile gives classnotfound error - posted by sunjay karan <su...@intel.com> on 2014/02/12 22:39:28 UTC, 1 replies.
- Complex mapping question - posted by goi cto <go...@gmail.com> on 2014/02/12 22:47:13 UTC, 0 replies.
- Spark Release 0.9.0 missing org.apache.spark.streaming package + misleading documentation on http://spark.incubator.apache.org/releases/spark-release-0-9-0.html - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/12 23:43:27 UTC, 5 replies.
- How could I set spark.scheduler.pool in the shark cli ? - posted by "leosandylh@gmail.com" <le...@gmail.com> on 2014/02/13 10:55:31 UTC, 0 replies.
- Java API - Serialization Issue - posted by Sonal Goyal <so...@gmail.com> on 2014/02/13 13:37:38 UTC, 0 replies.
- Spark clustering question - posted by goi cto <go...@gmail.com> on 2014/02/13 13:42:25 UTC, 2 replies.
- Spark on Tachyon - posted by arosenberger <ad...@gmail.com> on 2014/02/13 16:12:23 UTC, 0 replies.
- Too many open files - posted by "Korb, Michael [USA]" <Ko...@bah.com> on 2014/02/13 16:13:47 UTC, 1 replies.
- Re: [External] Re: Too many open files - posted by "Korb, Michael [USA]" <Ko...@bah.com> on 2014/02/13 18:51:54 UTC, 0 replies.
- Re: [External] Re: Too many open files - posted by Mayur Rustagi <ma...@gmail.com> on 2014/02/13 18:57:18 UTC, 4 replies.
- how to inc counter column w Calliope - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/13 19:23:49 UTC, 2 replies.
- Mapping between two RDD? - posted by goi cto <go...@gmail.com> on 2014/02/13 19:53:49 UTC, 0 replies.
- spark 0.9.0 sbt build [error] Nonzero exit code (128) - posted by srikanth <sy...@gmail.com> on 2014/02/13 20:17:26 UTC, 4 replies.
- Re: running Spark Streaming just once and stop it - posted by Tathagata Das <ta...@gmail.com> on 2014/02/13 20:46:02 UTC, 0 replies.
- Cluster launch - posted by Guanhua Yan <gh...@lanl.gov> on 2014/02/13 22:09:24 UTC, 5 replies.
- ADD_JARS not working on 0.9 - posted by Andre Kuhnen <an...@gmail.com> on 2014/02/13 23:12:21 UTC, 12 replies.
- GraphX Graph Input RDD Partitioning - posted by Adam Novak <an...@soe.ucsc.edu> on 2014/02/13 23:41:18 UTC, 0 replies.
- Spark streaming questions - posted by Sourav Chandra <so...@livestream.com> on 2014/02/14 02:49:24 UTC, 24 replies.
- Trouble on runnig spark's hbasetest example - posted by 林武康 <vb...@gmail.com> on 2014/02/14 08:35:27 UTC, 0 replies.
- Performance and serialization: use case - posted by Pierre Borckmans <pi...@realimpactanalytics.com> on 2014/02/14 14:57:22 UTC, 5 replies.
- RDD API question - posted by Sonal Goyal <so...@gmail.com> on 2014/02/14 16:42:42 UTC, 6 replies.
- What are all the things to Monitor to keep the spark jobs from failure - posted by Akhil Das <ak...@mobipulse.in> on 2014/02/14 17:41:03 UTC, 2 replies.
- checkpoint and not running out of disk space - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/14 17:47:13 UTC, 10 replies.
- Spark 0.8.1 on Amazon Elastic MapReduce - posted by "Deyhim, Parviz" <pa...@amazon.com> on 2014/02/14 17:53:53 UTC, 2 replies.
- [PoC] ZPark-Ztream : driving spark stream with scalaz-stream - posted by Pascal Voitot Dev <pa...@gmail.com> on 2014/02/15 01:09:57 UTC, 4 replies.
- Re: NoSuchMethodError: org.apache.commons.io.IOUtils.closeQuietly with cdh4 binary - posted by kamatsuoka <ke...@gmail.com> on 2014/02/15 02:04:25 UTC, 1 replies.
- Re: libraryDependencies configuration is different for sbt assembly vs sbt run - posted by kamatsuoka <ke...@gmail.com> on 2014/02/15 02:08:44 UTC, 0 replies.
- Inconsistent behavior when running spark on top of tachyon on top of HDFS HA - posted by elyast <lu...@gmail.com> on 2014/02/15 02:18:00 UTC, 4 replies.
- Getting started using spark for computer vision and video analytics - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/02/15 08:43:20 UTC, 4 replies.
- Standalone cluster setup: binding to private IP - posted by David Thomas <dt...@gmail.com> on 2014/02/15 21:15:10 UTC, 10 replies.
- KryoDeserialization getting java.io.EOFException - posted by Fabrizio Milo aka misto <mi...@gmail.com> on 2014/02/15 21:24:44 UTC, 1 replies.
- Running app on the standalone cluster - posted by David Thomas <dt...@gmail.com> on 2014/02/16 00:58:44 UTC, 0 replies.
- how to best imitate a real cluster with small number of nodes - posted by dachuan <hd...@gmail.com> on 2014/02/16 04:51:48 UTC, 0 replies.
- Spark Streaming on a cluster - posted by amirtuval <am...@gmail.com> on 2014/02/16 17:48:53 UTC, 9 replies.
- Spark Cluster Size - posted by Bharath Mundlapudi <mu...@gmail.com> on 2014/02/16 21:41:56 UTC, 1 replies.
- Setting serializer in Spark shell - posted by David Thomas <dt...@gmail.com> on 2014/02/16 22:22:24 UTC, 2 replies.
- Behavior of Fetching File using local cluster - posted by Fabrizio Milo aka misto <mi...@gmail.com> on 2014/02/16 22:45:05 UTC, 1 replies.
- Re: problems with standalone cluster - posted by dachuan <hd...@gmail.com> on 2014/02/16 23:10:51 UTC, 0 replies.
- ./bin/spark-shell get killed after 8 seconds - posted by dachuan <hd...@gmail.com> on 2014/02/16 23:22:33 UTC, 0 replies.
- Basic Spark terms - Slaves, workers, executors - posted by Soumya Simanta <so...@gmail.com> on 2014/02/16 23:33:25 UTC, 3 replies.
- Using local[N] gets "Too many open files" - posted by Matthew Cheah <ma...@gmail.com> on 2014/02/17 03:14:32 UTC, 0 replies.
- Using local[N] gets "Too many open files"? - posted by Matthew Cheah <ma...@gmail.com> on 2014/02/17 03:18:02 UTC, 2 replies.
- Connecting an Application to the Cluster - posted by David Thomas <dt...@gmail.com> on 2014/02/17 03:19:01 UTC, 10 replies.
- Set up spark cluster without root access - posted by Matthew Cheah <ma...@gmail.com> on 2014/02/17 03:39:59 UTC, 1 replies.
- Please help running a standalone app on a Spark cluster - posted by Soumya Simanta <so...@gmail.com> on 2014/02/17 04:18:19 UTC, 7 replies.
- Resource Allocation in Mesos - Intelligent ?? - posted by Sai Prasanna <an...@gmail.com> on 2014/02/17 05:02:24 UTC, 0 replies.
- Mesos Scheduler - posted by Sai Prasanna <an...@gmail.com> on 2014/02/17 06:32:45 UTC, 0 replies.
- How to use FlumeInputDStream in spark cluster? - posted by anoldbrain <an...@gmail.com> on 2014/02/17 08:31:18 UTC, 9 replies.
- Shared access to RDD - posted by David Thomas <dt...@gmail.com> on 2014/02/17 17:41:04 UTC, 2 replies.
- 1 day window size - posted by cem <ca...@gmail.com> on 2014/02/17 17:48:36 UTC, 1 replies.
- Building a Standalone App in Scala and graphX - posted by xben <xb...@free.fr> on 2014/02/17 18:02:40 UTC, 5 replies.
- Excessive memory overheads using yarn client - posted by Issac Buenrostro <bu...@ooyala.com> on 2014/02/17 18:33:49 UTC, 0 replies.
- Interleaving stages - posted by David Thomas <dt...@gmail.com> on 2014/02/17 18:51:18 UTC, 6 replies.
- yarn documentation - posted by Koert Kuipers <ko...@tresata.com> on 2014/02/17 20:21:21 UTC, 0 replies.
- Fwd: Why is Spark not using all cores on a single machine? - posted by Johan Verwey <jo...@gmail.com> on 2014/02/17 20:50:06 UTC, 2 replies.
- Kmeans example with floats - posted by agg <ag...@gmail.com> on 2014/02/17 20:58:50 UTC, 3 replies.
- RESTful API for Spark - posted by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/02/17 23:45:28 UTC, 4 replies.
- Quick start example (README.md count) doesn't work - posted by mohitvora <mo...@gmail.com> on 2014/02/18 00:01:14 UTC, 2 replies.
- OutOfMemoryError with basic kmeans - posted by agg <ag...@gmail.com> on 2014/02/18 01:49:59 UTC, 2 replies.
- Defining SparkShell Init? - posted by Kyle Ellrott <ke...@soe.ucsc.edu> on 2014/02/18 01:53:44 UTC, 3 replies.
- Monitor the network Communication - posted by lihu <li...@gmail.com> on 2014/02/18 04:35:41 UTC, 0 replies.
- Nodes failing when using MEMORY_AND_DISK_SER - posted by agg <ag...@gmail.com> on 2014/02/18 06:18:50 UTC, 0 replies.
- How to efficiently join this two complicated rdds - posted by hanbo <ha...@gmail.com> on 2014/02/18 08:06:42 UTC, 12 replies.
- DStream.saveAsTextFiles() saves nothing - posted by robin_up <ro...@gmail.com> on 2014/02/18 08:25:24 UTC, 4 replies.
- Re: ExternalAppendOnlyMap throw no such element - posted by guojc <gu...@gmail.com> on 2014/02/18 11:53:57 UTC, 0 replies.
- Testing if an RDD is empty? - posted by Sampo Niskanen <sa...@wellmo.com> on 2014/02/18 13:33:03 UTC, 3 replies.
- Java Spark job significantly slower than Python - posted by "Korb, Michael [USA]" <Ko...@bah.com> on 2014/02/18 18:55:08 UTC, 0 replies.
- question about compiling SimpleApp - posted by dachuan <hd...@gmail.com> on 2014/02/18 20:21:19 UTC, 6 replies.
- NetworkWordCount Tests - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/02/18 21:00:26 UTC, 0 replies.
- reduceByKey() is not a member of org.apache.spark.streaming.dstream.DStream[(String, Int)] - posted by bethesda <sw...@mac.com> on 2014/02/18 21:03:21 UTC, 2 replies.
- Spark cannot find a class at runtime for a standalone Scala program - posted by Soumya Simanta <so...@gmail.com> on 2014/02/18 22:24:13 UTC, 1 replies.
- unit testing with spark - posted by Ameet Kini <am...@gmail.com> on 2014/02/18 22:36:10 UTC, 5 replies.
- Is DStream read from the beginning upon node crash or from where it left off - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/18 23:31:45 UTC, 1 replies.
- Mutating RDD - posted by David Thomas <dt...@gmail.com> on 2014/02/19 03:33:31 UTC, 4 replies.
- Question on web UI - posted by David Thomas <dt...@gmail.com> on 2014/02/19 04:06:29 UTC, 3 replies.
- Resource Allocation: Spark on Mesos - posted by Sai Prasanna <an...@gmail.com> on 2014/02/19 04:13:59 UTC, 0 replies.
- Unable to submit an application to standalone cluster which on hdfs. - posted by samuel281 <sa...@gmail.com> on 2014/02/19 05:35:56 UTC, 3 replies.
- Spark Streaming windowing Driven by absolutely time? - posted by Aries Kong <ar...@gmail.com> on 2014/02/19 07:05:50 UTC, 5 replies.
- How to compile Spark applications using sbt? - posted by Tao Xiao <xi...@gmail.com> on 2014/02/19 10:21:03 UTC, 4 replies.
- OutOfMemory Error - posted by zhaoxw12 <zh...@mails.tsinghua.edu.cn> on 2014/02/19 10:51:09 UTC, 2 replies.
- Spark 0.9.0 - posted by Gino Mathews <gi...@thinkpalm.com> on 2014/02/19 10:55:28 UTC, 1 replies.
- Spark process locality - posted by vinay Bajaj <vb...@gmail.com> on 2014/02/19 10:59:07 UTC, 8 replies.
- how can I make the sliding window in Spark Streaming driven by data timestamp instead of absolute time - posted by Aries Kong <ar...@gmail.com> on 2014/02/19 13:19:32 UTC, 0 replies.
- ReduceByKey or groupByKey to Count? - posted by dmpour23 <dm...@gmail.com> on 2014/02/19 15:52:16 UTC, 5 replies.
- How is a spark node crash handled by spark wrt running DStream - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/19 16:29:44 UTC, 0 replies.
- Building Spark reports "Could not create directory /usr/local/ims/spark/spark-0.9.0-incubating-bin-hadoop1/assembly/target/streams/compile/$global/$global" - posted by Tao Xiao <xi...@gmail.com> on 2014/02/19 16:44:49 UTC, 1 replies.
- Re: Building Spark reports "Could not create directory/usr/local/ims/spark/spark-0.9.0-incubating-bin-hadoop1/assembly/target/streams/compile/$global/$global" - posted by Nan Zhu <zh...@gmail.com> on 2014/02/19 16:53:08 UTC, 0 replies.
- Execution blocked when collect()ing some relatively big blocks on spark 0.9 - posted by Guillaume Pitel <gu...@exensa.com> on 2014/02/19 17:23:23 UTC, 10 replies.
- Why collect() has a stage but first() not? - posted by David Thomas <dt...@gmail.com> on 2014/02/19 18:55:32 UTC, 5 replies.
- Q: Discretized Streams: Fault-Tolerant Streaming Computation paper - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/19 21:45:03 UTC, 3 replies.
- How to achieve this in Spark - posted by Soumya Simanta <so...@gmail.com> on 2014/02/19 23:23:26 UTC, 3 replies.
- Basic question on RDD caching - posted by David Thomas <dt...@gmail.com> on 2014/02/20 07:03:43 UTC, 3 replies.
- Master UI does not show any driver information - posted by Sourav Chandra <so...@livestream.com> on 2014/02/20 08:27:42 UTC, 0 replies.
- Unable to read HDFS file -- SimpleApp.java - posted by Prasad <ra...@gmail.com> on 2014/02/20 10:12:07 UTC, 2 replies.
- How I can run the sbt command on the server - posted by lihu <li...@gmail.com> on 2014/02/20 12:51:03 UTC, 3 replies.
- Fair Scheduling - posted by vinay Bajaj <vb...@gmail.com> on 2014/02/20 14:56:02 UTC, 0 replies.
- Tachyon RDD Caching - posted by arosenberger <ad...@gmail.com> on 2014/02/20 14:57:03 UTC, 0 replies.
- How to submit a job to Spark cluster? - posted by Tao Xiao <xi...@gmail.com> on 2014/02/20 16:13:15 UTC, 6 replies.
- accumulator duration - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/20 17:30:42 UTC, 0 replies.
- saving RDD to disk - fault tolerance - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/20 19:41:23 UTC, 1 replies.
- Explain About Logs NetworkWordcount.scala - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/02/20 19:46:59 UTC, 2 replies.
- How to build spark-0.9.0 on a proxied system - posted by 9000revs <90...@gmail.com> on 2014/02/20 20:26:35 UTC, 1 replies.
- Running spark on cluster - posted by Mohit Singh <mo...@gmail.com> on 2014/02/20 22:43:05 UTC, 5 replies.
- file not found - posted by Mohit Singh <mo...@gmail.com> on 2014/02/21 01:25:58 UTC, 2 replies.
- multi-concurrent proccessing - posted by "Livni, Dana" <da...@intel.com> on 2014/02/21 04:30:33 UTC, 2 replies.
- Custom RDD gets HadoopSplit for compute() call - posted by Dmitriy Lyubimov <dl...@gmail.com> on 2014/02/21 04:49:37 UTC, 2 replies.
- Re: SparkContext startup time out - posted by yaoxin <ya...@gmail.com> on 2014/02/21 05:09:13 UTC, 1 replies.
- Can spark-streaming work with spark-on-yarn mode? - posted by 林武康 <vb...@gmail.com> on 2014/02/21 08:16:28 UTC, 1 replies.
- [incubating-0.9.0] Too Many Open Files on Workers - posted by andy petrella <an...@gmail.com> on 2014/02/21 10:32:04 UTC, 3 replies.
- Using PySpark for Streaming - posted by Prasanth Prahladan <pr...@gmail.com> on 2014/02/21 12:33:25 UTC, 3 replies.
- Spark HDFS read/write in local mode and cluster mode behaviour - posted by Jaggu <ja...@gmail.com> on 2014/02/21 13:07:10 UTC, 0 replies.
- spark/shark + cql3 - posted by Liam Stewart <li...@gmail.com> on 2014/02/21 19:05:32 UTC, 7 replies.
- log4j on pyspark - posted by Diana Carroll <dc...@cloudera.com> on 2014/02/21 20:04:57 UTC, 0 replies.
- OOM when calling cache on RDD with big data - posted by tdeng <td...@twitter.com> on 2014/02/21 20:56:15 UTC, 1 replies.
- Spark Summit 2014 - posted by Scott walent <sc...@gmail.com> on 2014/02/21 20:57:09 UTC, 0 replies.
- unable to deploy spark cluster with docker script - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/02/21 21:17:57 UTC, 0 replies.
- UseCompressedStrings - posted by Fabrizio Milo aka misto <mi...@gmail.com> on 2014/02/22 01:40:38 UTC, 1 replies.
- spark compile time error: jarOfClass - posted by Ankur Saran <an...@buffalo.edu> on 2014/02/22 02:15:43 UTC, 1 replies.
- Spark-AMI version compatibility table - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/22 05:43:34 UTC, 1 replies.
- Trying to connect to spark from within a web server - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/02/22 06:36:46 UTC, 5 replies.
- Get RDD partition location - posted by Grega Kešpret <gr...@celtra.com> on 2014/02/22 21:11:41 UTC, 1 replies.
- How to sort an RDD ? - posted by Fabrizio Milo aka misto <mi...@gmail.com> on 2014/02/22 23:41:53 UTC, 5 replies.
- standalone spark app build.sbt compilation error - posted by hadoop user <us...@gmail.com> on 2014/02/22 23:54:16 UTC, 2 replies.
- programmatic way to tell Spark version - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/23 01:04:46 UTC, 2 replies.
- Spark High Availability - posted by Matan Shukry <ma...@gmail.com> on 2014/02/23 01:12:26 UTC, 3 replies.
- Creating a Spark context from a Scalatra servlet - posted by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/02/23 17:26:30 UTC, 7 replies.
- Set memory when using local[k] - posted by agg <ag...@gmail.com> on 2014/02/23 21:50:55 UTC, 3 replies.
- High CPU usage - posted by "Livni, Dana" <da...@intel.com> on 2014/02/23 21:57:04 UTC, 1 replies.
- Spark Quick Start - call to open README.md needs explicit fs prefix - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/24 03:33:10 UTC, 2 replies.
- GraphX with UUID vertex IDs instead of Long - posted by Deepak Nulu <de...@gmail.com> on 2014/02/24 03:39:01 UTC, 14 replies.
- Disable all spark logging - posted by agg <ag...@gmail.com> on 2014/02/24 05:45:08 UTC, 3 replies.
- Having Spark read a JSON file - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/24 06:10:02 UTC, 4 replies.
- Re: java.io.NotSerializableException - posted by "leosandylh@gmail.com" <le...@gmail.com> on 2014/02/24 12:14:58 UTC, 2 replies.
- Shark server crashes-[Thrift Error]: java.net.SocketException: Socket closed - posted by Arpit Tak <ar...@gmail.com> on 2014/02/24 13:08:20 UTC, 1 replies.
- java.lang.ClassNotFoundException - posted by Terance Dias <te...@gmail.com> on 2014/02/24 13:49:21 UTC, 0 replies.
- Nothing happens when executing on cluster - posted by Anders Bennehag <an...@tajitsu.com> on 2014/02/24 15:55:55 UTC, 1 replies.
- metrics.MetricsSystem: Sink class org.apache.spark.metrics.sink.MetricsServlet cannot be instantialized - posted by Grega Kešpret <gr...@celtra.com> on 2014/02/24 21:06:46 UTC, 0 replies.
- ETL on pyspark - posted by Chengi Liu <ch...@gmail.com> on 2014/02/24 22:08:24 UTC, 7 replies.
- cached rdd in memory eviction - posted by Koert Kuipers <ko...@tresata.com> on 2014/02/24 22:13:53 UTC, 0 replies.
- apparently non-critical errors running spark-ec2 launch - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/25 02:26:53 UTC, 2 replies.
- How to get well-distribute partition - posted by zhaoxw12 <zh...@mails.tsinghua.edu.cn> on 2014/02/25 03:16:58 UTC, 4 replies.
- Is it necessary to call setID in SparkHadoopWriter.scala - posted by haosdent <ha...@gmail.com> on 2014/02/25 03:43:06 UTC, 0 replies.
- Filter on Date by comparing - posted by Soumya Simanta <so...@gmail.com> on 2014/02/25 03:57:08 UTC, 4 replies.
- Running GraphX example from Scala REPL - posted by Soumya Simanta <so...@gmail.com> on 2014/02/25 04:09:19 UTC, 0 replies.
- Re: java.io.NotSerializableException Of dependent Java lib. - posted by yaoxin <ya...@gmail.com> on 2014/02/25 04:11:05 UTC, 0 replies.
- spark failure - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/02/25 06:21:05 UTC, 0 replies.
- Re: Can spark-streaming work with spark-on-yarn mode? - posted by 林武康 <vb...@gmail.com> on 2014/02/25 06:51:05 UTC, 0 replies.
- Job initialization performance of Spark standalone mode vs YARN - posted by polkosity <po...@gmail.com> on 2014/02/25 07:22:00 UTC, 1 replies.
- WARNING: Spark lists moving to spark.apache.org domain name - posted by Matei Zaharia <ma...@gmail.com> on 2014/02/25 07:42:59 UTC, 0 replies.
- Spark performance optimization - posted by polkosity <po...@gmail.com> on 2014/02/25 07:43:41 UTC, 2 replies.
- HBase row count - posted by Soumitra Kumar <ku...@gmail.com> on 2014/02/25 08:15:09 UTC, 10 replies.
- Re: - posted by Eugen Cepoi <ce...@gmail.com> on 2014/02/25 10:17:08 UTC, 1 replies.
- Need some tutorials and examples about customized partitioner - posted by Tao Xiao <xi...@gmail.com> on 2014/02/25 10:19:32 UTC, 4 replies.
- Size of RDD larger than Size of data on disk - posted by Suraj Satishkumar Sheth <su...@adobe.com> on 2014/02/25 15:47:21 UTC, 3 replies.
- Spark in YARN HDP problem - posted by aecc <al...@gmail.com> on 2014/02/25 16:35:46 UTC, 3 replies.
- Kryo serialization does not compress - posted by pradeeps8 <sr...@gmail.com> on 2014/02/25 16:39:07 UTC, 0 replies.
- Sharing SparkContext - posted by abhinav chowdary <ab...@gmail.com> on 2014/02/25 18:59:55 UTC, 8 replies.
- Help with building and running examples with GraphX from the REPL - posted by Soumya Simanta <so...@gmail.com> on 2014/02/26 03:25:36 UTC, 0 replies.
- Re: [HELP] ask for some information about public data set - posted by "Evan R. Sparks" <ev...@gmail.com> on 2014/02/26 03:45:02 UTC, 0 replies.
- NullPointerException from 'Count' on DStream - posted by anoldbrain <an...@gmail.com> on 2014/02/26 03:52:08 UTC, 0 replies.
- Implementing a custom Spark shell - posted by Sampo Niskanen <sa...@wellmo.com> on 2014/02/26 08:44:36 UTC, 3 replies.
- Kyro Registration, class is not registered, but Log.TRACE() says otherwise - posted by pondwater <bi...@gmail.com> on 2014/02/26 15:51:04 UTC, 0 replies.
- Build Spark in IntelliJ IDEA 13 - posted by Yanzhe Chen <ya...@gmail.com> on 2014/02/26 17:59:00 UTC, 3 replies.
- Dealing with headers in csv file pyspark - posted by Chengi Liu <ch...@gmail.com> on 2014/02/26 18:28:06 UTC, 4 replies.
- window every n elements instead of time based - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/26 18:34:52 UTC, 3 replies.
- specify output format using pyspark - posted by Chengi Liu <ch...@gmail.com> on 2014/02/26 18:43:38 UTC, 2 replies.
- Actors and sparkcontext actions - posted by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/02/26 20:39:05 UTC, 0 replies.
- failed task running in a loop - posted by Vipul Pandey <vi...@gmail.com> on 2014/02/26 21:36:55 UTC, 0 replies.
- skipping ahead in RDD - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/26 22:23:33 UTC, 2 replies.
- JVM error - posted by Mohit Singh <mo...@gmail.com> on 2014/02/26 23:39:46 UTC, 7 replies.
- worker keeps getting disassociated upon a failed job spark version 0.90 - posted by Shirish <sh...@gmail.com> on 2014/02/27 01:14:16 UTC, 0 replies.
- Messy GraphX merge/reduce functions - posted by Dan Davies <da...@parc.com> on 2014/02/27 07:49:55 UTC, 1 replies.
- Rename filter() into keep(), remove() or take() ? - posted by Bertrand Dechoux <de...@gmail.com> on 2014/02/27 13:36:53 UTC, 4 replies.
- is RDD failure transparent to stream consumer - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/02/27 18:18:59 UTC, 3 replies.
- Spark streaming on ec2 - posted by Aureliano Buendia <bu...@gmail.com> on 2014/02/27 19:11:15 UTC, 11 replies.
- IncompatibleClassChangeError while running a spark program - posted by Usman Ghani <us...@platfora.com> on 2014/02/27 20:59:42 UTC, 2 replies.
- What's the difference between map and transform in spark streaming? - posted by Aureliano Buendia <bu...@gmail.com> on 2014/02/27 22:42:53 UTC, 2 replies.
- Re: Build Spark Against CDH5 - posted by Brian Brunner <br...@gmail.com> on 2014/02/28 00:46:49 UTC, 1 replies.
- Running Spark with Python 2.7.5+ - posted by "nicholas.chammas" <ni...@gmail.com> on 2014/02/28 00:48:42 UTC, 4 replies.
- Scalatra servlet with actors and SparkContext - posted by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/02/28 02:22:20 UTC, 0 replies.
- Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1 - posted by Prasad <ra...@gmail.com> on 2014/02/28 17:51:43 UTC, 8 replies.
- Key Sort order on reduction - posted by Usman Ghani <us...@platfora.com> on 2014/02/28 18:38:20 UTC, 0 replies.
- Use pyspark for following. - posted by Chengi Liu <ch...@gmail.com> on 2014/02/28 19:31:40 UTC, 1 replies.
- Spark stream example SimpleZeroMQPublisher high cpu usage - posted by Aureliano Buendia <bu...@gmail.com> on 2014/02/28 22:12:03 UTC, 0 replies.
- Re: Kryo Registration, class is not registered, but Log.TRACE() says otherwise - posted by pondwater <bi...@gmail.com> on 2014/02/28 22:50:21 UTC, 0 replies.