user@spark.apache.org, 2014-04

- Re: Using ProtoBuf 2.5 for messages with Spark Streaming - posted by Patrick Wendell <pw...@gmail.com> on 2014/04/01 01:51:27 UTC, 12 replies.
- Re: batching the output - posted by Patrick Wendell <pw...@gmail.com> on 2014/04/01 01:54:16 UTC, 0 replies.
- Calling Spark enthusiasts in Austin, TX - posted by Ognen Duzlevski <og...@nengoiksvelzud.com> on 2014/04/01 02:57:00 UTC, 0 replies.
- Re: network wordcount example - posted by Chris Fregly <ch...@fregly.com> on 2014/04/01 03:13:03 UTC, 0 replies.
- Re: java.lang.ClassNotFoundException - spark on mesos - posted by Bharath Bhushan <ma...@outlook.com> on 2014/04/01 04:05:08 UTC, 2 replies.
- Re: Calling Spark enthusiasts in NYC - posted by giive chen <th...@gmail.com> on 2014/04/01 06:34:58 UTC, 1 replies.
- advanced training or implementation assistance - posted by "Livni, Dana" <da...@intel.com> on 2014/04/01 08:17:12 UTC, 0 replies.
- Hadoop LR comparison - posted by Tsai Li Ming <ma...@ltsai.com> on 2014/04/01 08:38:04 UTC, 2 replies.
- Re: Configuring distributed caching with Spark and YARN - posted by santhoma <sa...@yahoo.com> on 2014/04/01 09:33:02 UTC, 0 replies.
- SSH problem - posted by Sai Prasanna <an...@gmail.com> on 2014/04/01 12:02:25 UTC, 0 replies.
- Sliding Subwindows - posted by aecc <al...@gmail.com> on 2014/04/01 15:14:05 UTC, 0 replies.
- foreach not working - posted by eric perler <er...@hotmail.com> on 2014/04/01 16:08:50 UTC, 0 replies.
- Re: Unable to submit an application to standalone cluster which on hdfs. - posted by "haikal.pribadi" <ha...@gmail.com> on 2014/04/01 16:38:30 UTC, 0 replies.
- custom receiver in java - posted by eric perler <er...@hotmail.com> on 2014/04/01 16:54:43 UTC, 1 replies.
- Use combineByKey and StatCount - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/04/01 16:55:10 UTC, 2 replies.
- Re: Is there a way to get the current progress of the job? - posted by Philip Ogren <ph...@oracle.com> on 2014/04/01 17:43:41 UTC, 8 replies.
- Mllib in pyspark for 0.8.1 - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/01 17:53:52 UTC, 1 replies.
- Re: Using ProtoBuf 2.5 for messages with Spark Streaming - posted by Vipul Pandey <vi...@gmail.com> on 2014/04/01 21:23:06 UTC, 0 replies.
- Re: possible bug in Spark's ALS implementation... - posted by Nick Pentreath <ni...@gmail.com> on 2014/04/01 22:15:24 UTC, 4 replies.
- Re: Best practices: Parallelized write to / read from S3 - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/01 22:51:10 UTC, 2 replies.
- Protobuf 2.5 Mesos - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/01 22:51:40 UTC, 1 replies.
- Generic types and pair RDDs - posted by Daniel Siegmann <da...@velos.io> on 2014/04/01 22:55:38 UTC, 3 replies.
- PySpark RDD.partitionBy() requires an RDD of tuples - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/02 00:01:07 UTC, 6 replies.
- Cannot Access Web UI - posted by yxzhao <yx...@ualr.edu> on 2014/04/02 00:30:48 UTC, 4 replies.
- Issue with zip and partitions - posted by Pa...@Dell.com on 2014/04/02 04:27:01 UTC, 2 replies.
- Status of MLI? - posted by Krakna H <sh...@gmail.com> on 2014/04/02 04:38:49 UTC, 8 replies.
- Re: How to index each map operation???? - posted by Thierry Herrmann <th...@gmail.com> on 2014/04/02 05:10:59 UTC, 2 replies.
- CDH5 Spark on EC2 - posted by Denny Lee <de...@gmail.com> on 2014/04/02 09:44:17 UTC, 2 replies.
- Re: Calling Spahk enthusiasts in Boston - posted by Pierre Borckmans <pi...@realimpactanalytics.com> on 2014/04/02 10:32:49 UTC, 1 replies.
- java.lang.NoClassDefFoundError: scala/tools/nsc/transform/UnCurry$UnCurryTransformer... - posted by "Francis.Hu" <fr...@reachjunction.com> on 2014/04/02 12:26:53 UTC, 2 replies.
- ActorNotFound problem for mesos driver - posted by Leon Zhang <le...@gmail.com> on 2014/04/02 17:30:51 UTC, 3 replies.
- Re: spark-streaming - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/04/02 18:23:33 UTC, 0 replies.
- Print line in JavaNetworkWordCount - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/02 20:26:55 UTC, 0 replies.
- Resilient nature of RDD - posted by David Thomas <dt...@gmail.com> on 2014/04/02 20:27:55 UTC, 3 replies.
- Spark output compression on HDFS - posted by Kostiantyn Kudriavtsev <ku...@gmail.com> on 2014/04/02 21:18:28 UTC, 9 replies.
- Need suggestions - posted by yh18190 <yh...@gmail.com> on 2014/04/02 22:01:49 UTC, 6 replies.
- Optimal Server Design for Spark - posted by Stephen Watt <sw...@redhat.com> on 2014/04/02 23:58:43 UTC, 6 replies.
- Measure the Total Network I/O, Cpu and Memory Consumed by Spark Job - posted by yxzhao <yx...@ualr.edu> on 2014/04/03 00:40:16 UTC, 2 replies.
- Regarding Sparkcontext object - posted by yh18190 <yh...@gmail.com> on 2014/04/03 01:11:32 UTC, 1 replies.
- Efficient way to aggregate event data at daily/weekly/monthly level - posted by K Koh <de...@gmail.com> on 2014/04/03 02:22:12 UTC, 1 replies.
- How to ask questions on Spark usage? - posted by weida xu <xw...@gmail.com> on 2014/04/03 04:36:07 UTC, 1 replies.
- Spark RDD to Shark table IN MEMORY conversion - posted by abhietc31 <ab...@gmail.com> on 2014/04/03 04:46:58 UTC, 3 replies.
- Shark Direct insert into table value (?) - posted by abhietc31 <ab...@gmail.com> on 2014/04/03 04:52:46 UTC, 2 replies.
- Submitting to yarn cluster - posted by Ron Gonzalez <zl...@yahoo.com> on 2014/04/03 05:10:12 UTC, 4 replies.
- Error when run Spark on mesos - posted by felix <cn...@gmail.com> on 2014/04/03 05:35:54 UTC, 7 replies.
- Re: Spark streaming kafka _output_ - posted by Soren Macbeth <so...@yieldbot.com> on 2014/04/03 05:53:53 UTC, 2 replies.
- Example of creating expressions for SchemaRDD methods - posted by All In A Days Work <al...@gmail.com> on 2014/04/03 06:52:40 UTC, 4 replies.
- Re: Guidelines for Spark Cluster Sizing - posted by Sonal Goyal <so...@gmail.com> on 2014/04/03 10:06:25 UTC, 2 replies.
- shark failed when launching - posted by "wangyi@testbird.com" <wa...@testbird.com> on 2014/04/03 11:30:38 UTC, 0 replies.
- How to use addJar for adding external jars in spark-0.9? - posted by yh18190 <yh...@gmail.com> on 2014/04/03 12:36:23 UTC, 1 replies.
- How to stop system info output in spark shell - posted by weida xu <xw...@gmail.com> on 2014/04/03 13:46:47 UTC, 3 replies.
- Spark Disk Usage - posted by Surendranauth Hiraman <su...@velos.io> on 2014/04/03 14:27:36 UTC, 8 replies.
- Re: Strange behavior of RDD.cartesian - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/04/03 14:44:54 UTC, 0 replies.
- what does SPARK_EXECUTOR_URI in spark-env.sh do ? - posted by felix <cn...@gmail.com> on 2014/04/03 15:05:22 UTC, 3 replies.
- Avro serialization - posted by Ron Gonzalez <zl...@yahoo.com> on 2014/04/03 16:11:56 UTC, 3 replies.
- Spark 1.0.0 release plan - posted by Bhaskar Dutta <bh...@gmail.com> on 2014/04/03 19:30:59 UTC, 4 replies.
- Spark SQL transformations, narrow vs. wide - posted by Jan-Paul Bultmann <ja...@me.com> on 2014/04/03 21:59:51 UTC, 4 replies.
- Re: pySpark memory usage - posted by Matei Zaharia <ma...@gmail.com> on 2014/04/04 02:37:09 UTC, 5 replies.
- Sample Project for using Shark API in Spark programs - posted by Jerry Lam <ch...@gmail.com> on 2014/04/04 03:24:38 UTC, 2 replies.
- Re: java.lang.NoClassDefFoundError: scala/tools/nsc/transform/UnCurry$UnCurryTransformer... - posted by "Francis.Hu" <fr...@reachjunction.com> on 2014/04/04 08:49:48 UTC, 1 replies.
- How to create a RPM package - posted by Rahul Singhal <Ra...@guavus.com> on 2014/04/04 09:10:11 UTC, 6 replies.
- module not found: org.eclipse.paho#mqtt-client;0.4.0 - posted by Dear all <al...@163.com> on 2014/04/04 13:21:25 UTC, 1 replies.
- Explain Add Input - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 14:33:55 UTC, 0 replies.
- RAM high consume - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 14:53:17 UTC, 0 replies.
- how to save RDD partitions in different folders? - posted by dmpour23 <dm...@gmail.com> on 2014/04/04 15:01:01 UTC, 4 replies.
- How to start history tracking URL - posted by zhxfl <29...@qq.com> on 2014/04/04 15:10:17 UTC, 0 replies.
- Driver increase memory utilization - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 16:26:25 UTC, 0 replies.
- Hadoop 2.X Spark Client Jar 0.9.0 problem - posted by Erik Freed <er...@codecision.com> on 2014/04/04 16:28:19 UTC, 3 replies.
- reduceByKeyAndWindow Java - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 17:03:43 UTC, 3 replies.
- Re: Job initialization performance of Spark standalone mode vs YARN - posted by Ron Gonzalez <zl...@yahoo.com> on 2014/04/04 17:07:40 UTC, 0 replies.
- Parallelism level - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 17:41:07 UTC, 4 replies.
- Re: Error reading HDFS file using spark 0.9.0 / hadoop 2.2.0 - incompatible protobuf 2.5 and 2.4.1 - posted by Prasad <ra...@gmail.com> on 2014/04/04 17:57:03 UTC, 4 replies.
- RAM Increase - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/04 18:24:47 UTC, 0 replies.
- How are exceptions in map functions handled in Spark? - posted by John Salvatier <js...@gmail.com> on 2014/04/04 19:40:31 UTC, 5 replies.
- Largest Spark Cluster - posted by Parviz Deyhim <pd...@gmail.com> on 2014/04/04 21:05:14 UTC, 1 replies.
- Having spark-ec2 join new slaves to existing cluster - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/04 21:16:58 UTC, 4 replies.
- Re: example of non-line oriented input data? - posted by Matei Zaharia <ma...@gmail.com> on 2014/04/04 21:30:35 UTC, 0 replies.
- Spark on other parallel filesystems - posted by Venkat Krishnamurthy <ve...@yarcdata.com> on 2014/04/05 01:56:49 UTC, 6 replies.
- Heartbeat exceeds - posted by Debasish Das <de...@gmail.com> on 2014/04/05 02:52:36 UTC, 6 replies.
- exactly once - posted by Bharath Bhushan <ma...@outlook.com> on 2014/04/05 06:35:29 UTC, 0 replies.
- Redirect Incubator pages - posted by Andrew Ash <an...@andrewash.com> on 2014/04/05 08:19:00 UTC, 1 replies.
- Spark Worker in different machine doesnt work - posted by subacini Arunkumar <su...@gmail.com> on 2014/04/06 20:44:36 UTC, 0 replies.
- Re: any work around to support nesting of RDDs other than join - posted by nkd <ka...@gmail.com> on 2014/04/06 22:05:35 UTC, 0 replies.
- hang caused by memory threshold? - posted by Stuart Zakon <sz...@objectsbydesign.com> on 2014/04/07 02:06:18 UTC, 0 replies.
- hang on sorting operation - posted by Stuart Zakon <sj...@yahoo.com> on 2014/04/07 11:52:00 UTC, 0 replies.
- Recommended way to develop spark application with both java and python - posted by Wush Wu <wu...@bridgewell.com> on 2014/04/07 11:58:13 UTC, 0 replies.
- Require some clarity on partitioning - posted by Sanjay Awatramani <sa...@yahoo.com> on 2014/04/07 15:17:24 UTC, 0 replies.
- Null Pointer Exception in Spark Application with Yarn Client Mode - posted by Sai Prasanna <an...@gmail.com> on 2014/04/07 15:26:56 UTC, 2 replies.
- non-lazy execution of sortByKey? - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/07 17:42:00 UTC, 3 replies.
- PySpark SocketConnect Issue in Cluster - posted by Surendranauth Hiraman <su...@velos.io> on 2014/04/07 19:10:12 UTC, 1 replies.
- AWS Spark-ec2 script with different user - posted by Marco Costantini <si...@granatads.com> on 2014/04/07 20:14:45 UTC, 13 replies.
- SparkContext.addFile() and FileNotFoundException - posted by Thierry Herrmann <th...@gmail.com> on 2014/04/07 20:35:06 UTC, 0 replies.
- ui broken in latest 1.0.0 - posted by Koert Kuipers <ko...@tresata.com> on 2014/04/07 22:06:00 UTC, 17 replies.
- Creating a SparkR standalone job - posted by pawan kumar <pk...@gmail.com> on 2014/04/07 23:21:54 UTC, 4 replies.
- Driver Out of Memory - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/08 00:12:44 UTC, 1 replies.
- CheckpointRDD has different number of partitions than original RDD - posted by Paul Mogren <PM...@commercehub.com> on 2014/04/08 00:48:33 UTC, 3 replies.
- job offering - posted by "Rault, Severan" <se...@amazon.com> on 2014/04/08 01:49:46 UTC, 0 replies.
- RDDInfo visibility SPARK-1132 - posted by Koert Kuipers <ko...@tresata.com> on 2014/04/08 02:05:55 UTC, 2 replies.
- trouble with "join" on large RDDs - posted by Brad Miller <bm...@eecs.berkeley.edu> on 2014/04/08 04:37:10 UTC, 6 replies.
- [BLOG] For Beginners - posted by prabeesh k <pr...@gmail.com> on 2014/04/08 06:54:07 UTC, 1 replies.
- Mongo-Hadoop Connector with Spark - posted by Pavan Kumar <kp...@gmail.com> on 2014/04/08 07:42:04 UTC, 0 replies.
- How to execute a function from class in distributed jar on each worker node? - posted by Adnan <ns...@gmail.com> on 2014/04/08 12:05:12 UTC, 1 replies.
- Only TraversableOnce? - posted by wxhsdp <wx...@gmail.com> on 2014/04/08 14:09:57 UTC, 6 replies.
- Re: spark-shell on standalone cluster gives error " no mesos in java.library.path" - posted by Christoph Böhm <li...@gmx.net> on 2014/04/08 15:06:52 UTC, 0 replies.
- NPE using saveAsTextFile - posted by Nick Pentreath <ni...@gmail.com> on 2014/04/08 16:50:27 UTC, 4 replies.
- Urgently need help interpreting duration - posted by Yana Kadiyska <ya...@gmail.com> on 2014/04/08 17:11:55 UTC, 3 replies.
- Spark and HBase - posted by Flavio Pompermaier <po...@okkam.it> on 2014/04/08 17:57:53 UTC, 9 replies.
- RDD creation on HDFS - posted by gtanguy <g....@gmail.com> on 2014/04/08 18:40:22 UTC, 0 replies.
- assumption that lib_managed is present - posted by Koert Kuipers <ko...@tresata.com> on 2014/04/08 18:54:18 UTC, 1 replies.
- Re: Pig on Spark - posted by Mayur Rustagi <ma...@gmail.com> on 2014/04/08 19:47:11 UTC, 15 replies.
- java.io.IOException: Call to dev/17.29.25.4:50070 failed on local exception: java.io.EOFException - posted by reegs <re...@gmail.com> on 2014/04/08 20:35:14 UTC, 1 replies.
- Why doesn't the driver node do any work? - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/08 21:24:11 UTC, 5 replies.
- Measuring Network Traffic for Spark Job - posted by yxzhao <yx...@ualr.edu> on 2014/04/08 21:57:42 UTC, 2 replies.
- ETL for postgres to hadoop - posted by Manas Kar <Ma...@exactearth.com> on 2014/04/08 22:00:55 UTC, 1 replies.
- Spark with SSL? - posted by kamatsuoka <ke...@gmail.com> on 2014/04/08 22:14:28 UTC, 5 replies.
- Preconditions on RDDs for creating a Graph? - posted by Adam Novak <an...@soe.ucsc.edu> on 2014/04/08 23:33:53 UTC, 1 replies.
- A series of meetups about machine learning with Spark in San Francisco - posted by DB Tsai <db...@stanford.edu> on 2014/04/09 01:06:13 UTC, 0 replies.
- issue of driver's HA - posted by 林武康 <vb...@gmail.com> on 2014/04/09 06:07:32 UTC, 0 replies.
- java.io.NotSerializableException exception - custom Accumulator - posted by Dhimant Jayswal <dh...@gmail.com> on 2014/04/09 07:07:39 UTC, 1 replies.
- Error when compiling spark in IDEA and best practice to use IDE? - posted by Dong Mo <mo...@gmail.com> on 2014/04/09 07:14:24 UTC, 4 replies.
- KafkaReciever Error when starting ssc (Actor name not unique) - posted by gaganbm <ga...@gmail.com> on 2014/04/09 08:59:14 UTC, 0 replies.
- Spark on YARN performance - posted by Flavio Pompermaier <po...@okkam.it> on 2014/04/09 09:10:19 UTC, 6 replies.
- Spark operators on Objects - posted by Flavio Pompermaier <po...@okkam.it> on 2014/04/09 09:19:26 UTC, 2 replies.
- Spark packaging - posted by Pradeep baji <pr...@gmail.com> on 2014/04/09 09:34:05 UTC, 3 replies.
- RDD top method exception - posted by mailforledkk <ma...@126.com> on 2014/04/09 13:28:54 UTC, 0 replies.
- To Ten RDD - posted by "Jeyaraj, Arockia R (Arockia)" <ar...@verizon.com> on 2014/04/09 15:55:47 UTC, 2 replies.
- What level of parallelism should I expect from my cluster? - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/09 17:34:35 UTC, 0 replies.
- executors not registering with the driver - posted by azurecoder <ri...@elastacloud.com> on 2014/04/09 17:38:24 UTC, 0 replies.
- hbase scan performance - posted by David Quigley <dq...@gmail.com> on 2014/04/09 18:02:14 UTC, 2 replies.
- How does Spark handle RDD via HDFS ? - posted by gtanguy <g....@gmail.com> on 2014/04/09 18:05:52 UTC, 0 replies.
- Re: How does Spark handle RDD via HDFS ? - posted by Andrew Ash <an...@andrewash.com> on 2014/04/09 18:39:27 UTC, 1 replies.
- How to change the parallelism level of input dstreams - posted by Dong Mo <mo...@gmail.com> on 2014/04/09 18:47:41 UTC, 0 replies.
- cannot run spark shell in yarn-client mode - posted by "Pennacchiotti, Marco" <mp...@ebay.com> on 2014/04/09 19:42:00 UTC, 0 replies.
- KafkaInputDStream Stops reading new messages - posted by Kanwaldeep <ka...@gmail.com> on 2014/04/09 22:57:33 UTC, 0 replies.
- is it possible to initiate Spark jobs from Oozie? - posted by "Segerlind, Nathan L" <na...@intel.com> on 2014/04/09 23:10:56 UTC, 2 replies.
- Spark 0.9.1 released - posted by Tathagata Das <ta...@gmail.com> on 2014/04/09 23:54:03 UTC, 5 replies.
- Problem with running LogisticRegression in spark cluster mode - posted by Jenny Zhao <li...@gmail.com> on 2014/04/10 00:05:21 UTC, 2 replies.
- Multi master Spark - posted by Pradeep Ch <pr...@gmail.com> on 2014/04/10 00:08:24 UTC, 5 replies.
- Re: programmatic way to tell Spark version - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/10 00:30:12 UTC, 5 replies.
- Best way to turn an RDD back into a SchemaRDD - posted by Jan-Paul Bultmann <ja...@me.com> on 2014/04/10 01:05:48 UTC, 1 replies.
- shuffle memory requirements - posted by Ameet Kini <am...@gmail.com> on 2014/04/10 04:48:06 UTC, 2 replies.
- Strange behaviour of different SSCs with same Kafka topic - posted by gaganbm <ga...@gmail.com> on 2014/04/10 08:24:12 UTC, 5 replies.
- Re: Where does println output go? - posted by wxhsdp <wx...@gmail.com> on 2014/04/10 11:20:05 UTC, 0 replies.
- Re: Shark CDH5 Final Release - posted by chutium <te...@gmail.com> on 2014/04/10 11:36:31 UTC, 0 replies.
- Re: RDD top method exception - posted by mailforledkk <ma...@126.com> on 2014/04/10 13:20:43 UTC, 0 replies.
- Executing spark jobs with predefined Hadoop user - posted by Asaf Lahav <as...@gmail.com> on 2014/04/10 14:14:30 UTC, 4 replies.
- Fwd: Spark - ready for prime time? - posted by Andras Nemeth <an...@lynxanalytics.com> on 2014/04/10 16:11:59 UTC, 20 replies.
- Spark 0.9.1 PySpark ImportError - posted by aazout <al...@velos.io> on 2014/04/10 16:39:16 UTC, 2 replies.
- Behaviour of caching when dataset does not fit into memory - posted by Pierre Borckmans <pi...@realimpactanalytics.com> on 2014/04/10 18:07:14 UTC, 4 replies.
- /bin/java not found: JAVA_HOME ignored launching shark executor - posted by Ken Ellinwood <ke...@yahoo.com> on 2014/04/10 21:02:43 UTC, 1 replies.
- Using pyspark shell in local[n] (single machine) mode unnecessarily tries to connect to HDFS NameNode ... - posted by DiData <su...@didata.us> on 2014/04/10 21:30:32 UTC, 4 replies.
- Error specifying Kafka params from Java - posted by Paul Mogren <PM...@commercehub.com> on 2014/04/10 22:55:17 UTC, 1 replies.
- SparkR with Sequence Files - posted by Gary Malouf <ma...@gmail.com> on 2014/04/11 00:37:10 UTC, 1 replies.
- Is Branch 1.0 build broken ? - posted by Chester Chen <ch...@yahoo.com> on 2014/04/11 01:29:27 UTC, 0 replies.
- Re: Is Branch 1.0 build broken ? - posted by Sean Owen <so...@cloudera.com> on 2014/04/11 08:33:41 UTC, 1 replies.
- Error when I use spark-streaming - posted by Hahn Jiang <ha...@gmail.com> on 2014/04/11 09:55:32 UTC, 2 replies.
- [GraphX] Cast error when comparing a vertex attribute after its type has changed - posted by Pierre-Alexandre Fonta <pi...@gmail.com> on 2014/04/11 13:42:56 UTC, 1 replies.
- Hybrid GPU CPU computation - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/04/11 14:38:12 UTC, 6 replies.
- Using Spark for Divide-and-Conquer Algorithms - posted by Yanzhe Chen <ya...@gmail.com> on 2014/04/11 14:52:35 UTC, 1 replies.
- Too many tasks in reduceByKey() when do PageRank iteration - posted by 张志齐 <go...@126.com> on 2014/04/11 15:32:26 UTC, 0 replies.
- GraphX - posted by Ghufran Malik <go...@gmail.com> on 2014/04/11 18:46:16 UTC, 0 replies.
- Spark behaviour when executor JVM crashes - posted by "deenar.toraskar" <de...@db.com> on 2014/04/11 19:34:44 UTC, 0 replies.
- Re: Setting properties in core-site.xml for Spark and Hadoop to access - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/11 20:47:29 UTC, 0 replies.
- 0.9 wont start cluster on ec2, SSH connection refused? - posted by Alton Alexander <al...@gmail.com> on 2014/04/11 21:37:08 UTC, 2 replies.
- Shutdown with streaming driver running in cluster broke master web UI permanently - posted by Paul Mogren <PM...@commercehub.com> on 2014/04/11 22:14:55 UTC, 0 replies.
- Huge matrix - posted by Xiaoli Li <li...@gmail.com> on 2014/04/11 22:54:40 UTC, 10 replies.
- SVD under spark/mllib/linalg - posted by wxhsdp <wx...@gmail.com> on 2014/04/12 03:12:45 UTC, 1 replies.
- Compile SimpleApp.scala encountered error, please can any one help? - posted by jni2000 <ja...@federatedwireless.com> on 2014/04/12 07:37:57 UTC, 3 replies.
- cannot exec. job: "TaskSchedulerImpl: Initial job has not accepted any resources" - posted by Gerd Koenig <ko...@googlemail.com> on 2014/04/12 15:12:43 UTC, 2 replies.
- Master registers itself at startup? - posted by ge ko <ko...@gmail.com> on 2014/04/12 15:19:23 UTC, 7 replies.
- Re: Changing number of workers for benchmarking purposes - posted by Kalpit Shah <sh...@gmail.com> on 2014/04/12 18:31:35 UTC, 0 replies.
- what is the difference between persist() and cache()? - posted by Joe L <se...@yahoo.com> on 2014/04/13 16:26:00 UTC, 1 replies.
- how to use a single filter instead of multiple filters - posted by Joe L <se...@yahoo.com> on 2014/04/13 18:39:31 UTC, 0 replies.
- Re: function state lost when next RDD is processed - posted by Chris Fregly <ch...@fregly.com> on 2014/04/13 20:14:57 UTC, 0 replies.
- (Unknown) - posted by ge ko <ko...@gmail.com> on 2014/04/13 21:51:06 UTC, 2 replies.
- Checkpoint Vs Cache - posted by David Thomas <dt...@gmail.com> on 2014/04/14 04:20:52 UTC, 2 replies.
- how to count maps without shuffling too much data? - posted by Joe L <se...@yahoo.com> on 2014/04/14 04:53:44 UTC, 0 replies.
- How to set spark worker memory size? - posted by Joe L <se...@yahoo.com> on 2014/04/14 05:02:14 UTC, 0 replies.
- how to count maps within a node? - posted by Joe L <se...@yahoo.com> on 2014/04/14 05:58:40 UTC, 0 replies.
- moving SparkContext around - posted by "Schein, Sagi" <sa...@hp.com> on 2014/04/14 06:40:09 UTC, 0 replies.
- Incredible slow iterative computation - posted by Andrea Esposito <an...@gmail.com> on 2014/04/14 12:45:39 UTC, 1 replies.
- Proper caching method - posted by Joe L <se...@yahoo.com> on 2014/04/14 14:32:05 UTC, 5 replies.
- Lost an executor error - Jobs fail - posted by Praveen R <pr...@sigmoidanalytics.com> on 2014/04/14 15:29:20 UTC, 5 replies.
- process_local vs node_local - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/04/14 16:13:30 UTC, 3 replies.
- RDD.tail() - posted by Philip Ogren <ph...@oracle.com> on 2014/04/14 17:24:15 UTC, 2 replies.
- reduceByKey issue in example wordcount (scala) - posted by Ian Bonnycastle <ib...@gmail.com> on 2014/04/14 18:17:08 UTC, 5 replies.
- RDD collect help - posted by Flavio Pompermaier <po...@okkam.it> on 2014/04/14 18:21:35 UTC, 12 replies.
- Re: Spark resilience - posted by Aaron Davidson <il...@gmail.com> on 2014/04/14 19:30:00 UTC, 7 replies.
- using Kryo with pyspark? - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/14 21:24:41 UTC, 1 replies.
- can't sc.paralellize in Spark 0.7.3 spark-shell - posted by Walrus theCat <wa...@gmail.com> on 2014/04/14 23:55:30 UTC, 4 replies.
- Pyspark with Cython - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/15 00:51:27 UTC, 0 replies.
- AmpCamp exercise in a local environment - posted by Nabeel Memon <nm...@gmail.com> on 2014/04/15 00:58:16 UTC, 3 replies.
- How to cogroup/join pair RDDs with different key types? - posted by Roger Hoover <ro...@gmail.com> on 2014/04/15 02:07:05 UTC, 6 replies.
- Scala vs Python performance differences - posted by Andrew Ash <an...@andrewash.com> on 2014/04/15 02:48:55 UTC, 4 replies.
- storage.MemoryStore estimated size 7 times larger than real - posted by wxhsdp <wx...@gmail.com> on 2014/04/15 04:07:24 UTC, 6 replies.
- shuffle vs performance - posted by Joe L <se...@yahoo.com> on 2014/04/15 05:47:48 UTC, 0 replies.
- Unsubscribe - posted by Chhaya Vishwakarma <Ch...@lntinfotech.com> on 2014/04/15 05:57:57 UTC, 0 replies.
- groupByKey returns a single partition in a RDD? - posted by Joe L <se...@yahoo.com> on 2014/04/15 09:22:05 UTC, 1 replies.
- Re: Comparing GraphX and GraphLab - posted by Qi Song <so...@gmail.com> on 2014/04/15 10:37:25 UTC, 0 replies.
- Spark program thows OutOfMemoryError - posted by Qin Wei <we...@dewmobile.net> on 2014/04/15 12:10:04 UTC, 3 replies.
- standalone vs YARN - posted by ishaaq <is...@gmail.com> on 2014/04/15 14:28:03 UTC, 2 replies.
- partitioning of small data sets - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/15 17:44:51 UTC, 4 replies.
- Streaming job having Cassandra query : OutOfMemoryError - posted by sonyjv <so...@yahoo.com> on 2014/04/15 19:03:51 UTC, 0 replies.
- Why these operations are slower than the equivalent on Hadoop? - posted by Yanzhe Chen <ya...@gmail.com> on 2014/04/15 19:03:57 UTC, 10 replies.
- scheduler question - posted by Mohit Jaggi <mo...@gmail.com> on 2014/04/15 19:31:18 UTC, 0 replies.
- Can't run a simple spark application with 0.9.1 - posted by Paul Schooss <pa...@gmail.com> on 2014/04/15 20:21:53 UTC, 1 replies.
- Shark: class java.io.IOException: Cannot run program "/bin/java" - posted by ge ko <ko...@gmail.com> on 2014/04/15 21:54:32 UTC, 2 replies.
- Problem with KryoSerializer - posted by yh18190 <yh...@gmail.com> on 2014/04/15 22:47:03 UTC, 0 replies.
- StackOverflow Error when run ALS with 100 iterations - posted by Xiaoli Li <li...@gmail.com> on 2014/04/15 23:29:38 UTC, 3 replies.
- Multi-tenant? - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/16 01:08:05 UTC, 2 replies.
- java.net.SocketException: Network is unreachable while connecting to HBase - posted by amit karmakar <am...@gmail.com> on 2014/04/16 01:09:35 UTC, 1 replies.
- JMX with Spark - posted by Paul Schooss <pa...@gmail.com> on 2014/04/16 03:08:35 UTC, 4 replies.
- what is the difference between element and partition? - posted by Joe L <se...@yahoo.com> on 2014/04/16 06:53:01 UTC, 1 replies.
- groupByKey(None) returns partitions according to the keys? - posted by Joe L <se...@yahoo.com> on 2014/04/16 06:58:37 UTC, 1 replies.
- Could I improve Spark performance partitioning elements in a RDD? - posted by Joe L <se...@yahoo.com> on 2014/04/16 08:30:18 UTC, 0 replies.
- what is a partition? how it works? - posted by Joe L <se...@yahoo.com> on 2014/04/16 09:29:41 UTC, 0 replies.
- Java heap space and spark.akka.frameSize Inbox x - posted by Chieh-Yen <r0...@csie.ntu.edu.tw> on 2014/04/16 11:18:00 UTC, 3 replies.
- PySpark still reading only text? - posted by Bertrand Dechoux <de...@gmail.com> on 2014/04/16 13:27:10 UTC, 7 replies.
- using saveAsNewAPIHadoopFile with OrcOutputFormat - posted by Brock Bose <br...@gmail.com> on 2014/04/16 15:51:11 UTC, 2 replies.
- Create cache fails on first time - posted by Arpit Tak <ar...@sigmoidanalytics.com> on 2014/04/16 16:44:42 UTC, 1 replies.
- graph.reverse & Pregel API - posted by Bogdan Ghidireac <gh...@gmail.com> on 2014/04/16 16:51:16 UTC, 2 replies.
- SPARK_YARN_APP_JAR, SPARK_CLASSPATH and ADD_JARS in a spark-shell on YARN - posted by Christophe Préaud <ch...@kelkoo.com> on 2014/04/16 18:27:46 UTC, 4 replies.
- Using google cloud storage for spark big data - posted by Aureliano Buendia <bu...@gmail.com> on 2014/04/16 19:59:29 UTC, 5 replies.
- sbt assembly error - posted by Yiou Li <li...@gmail.com> on 2014/04/16 21:03:30 UTC, 5 replies.
- Regarding Partitioner - posted by yh18190 <yh...@gmail.com> on 2014/04/16 23:38:52 UTC, 0 replies.
- Re: GC overhead limit exceeded - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/16 23:43:07 UTC, 2 replies.
- choose the number of partition according to the number of nodes - posted by Joe L <se...@yahoo.com> on 2014/04/17 01:50:50 UTC, 2 replies.
- Random Forest on Spark - posted by Laeeq Ahmed <la...@yahoo.com> on 2014/04/17 11:42:07 UTC, 27 replies.
- Shark: ClassNotFoundException org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat - posted by ge ko <ko...@gmail.com> on 2014/04/17 11:55:45 UTC, 3 replies.
- Spark on Yarn or Mesos? - posted by Wei Wang <xw...@gmail.com> on 2014/04/17 12:12:15 UTC, 4 replies.
- Spark Example Project, runnable on EMR, open sourced - posted by Alex Dean <al...@snowplowanalytics.com> on 2014/04/17 16:10:58 UTC, 2 replies.
- strange StreamCorruptedException - posted by Lukas Nalezenec <lu...@firma.seznam.cz> on 2014/04/17 16:39:23 UTC, 0 replies.
- confused by reduceByKey usage - posted by 诺铁 <no...@gmail.com> on 2014/04/17 18:29:18 UTC, 6 replies.
- Continuously running non-streaming jobs - posted by Jim Carroll <ji...@gmail.com> on 2014/04/17 20:02:54 UTC, 3 replies.
- Spark 0.9.1 core dumps on Mesos 0.18.0 - posted by Steven Cox <sc...@renci.org> on 2014/04/17 20:29:43 UTC, 7 replies.
- writing booleans w Calliope - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/04/17 22:11:25 UTC, 1 replies.
- Valid spark streaming use case? - posted by xargsgrep <ah...@gmail.com> on 2014/04/17 22:24:52 UTC, 4 replies.
- Re: distinct on huge dataset - posted by Ryan Compton <co...@gmail.com> on 2014/04/18 00:20:19 UTC, 2 replies.
- Ooyala Server - plans to merge it into Apache ? - posted by All In A Days Work <al...@gmail.com> on 2014/04/18 06:12:38 UTC, 3 replies.
- join with inputs co-partitioned? - posted by Joe L <se...@yahoo.com> on 2014/04/18 07:40:20 UTC, 0 replies.
- SPARK Shell RDD reuse - posted by Sai Prasanna <an...@gmail.com> on 2014/04/18 09:07:20 UTC, 0 replies.
- how to split one big RDD (about 100G) into several small ones? - posted by Joe L <se...@yahoo.com> on 2014/04/18 13:58:22 UTC, 0 replies.
- sc.makeRDD bug with NumericRange - posted by Aureliano Buendia <bu...@gmail.com> on 2014/04/18 19:25:48 UTC, 4 replies.
- Anyone using value classes in RDDs? - posted by kamatsuoka <ke...@gmail.com> on 2014/04/18 20:51:23 UTC, 5 replies.
- Thrift client to write directly to rdd - posted by bhardwaj_rajesh <bh...@hotmail.com> on 2014/04/18 23:20:51 UTC, 0 replies.
- Calliope Frame size larger than max length - posted by ericjohnston1989 <er...@gmail.com> on 2014/04/18 23:32:35 UTC, 1 replies.
- BFS implemented - posted by Ghufran Malik <go...@gmail.com> on 2014/04/19 00:15:27 UTC, 2 replies.
- Do developers have to be aware of Spark's fault tolerance mechanism? - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/04/19 02:11:42 UTC, 4 replies.
- Spark-ec2 asks for password - posted by Aureliano Buendia <bu...@gmail.com> on 2014/04/19 05:57:48 UTC, 5 replies.
- Combining RDD's columns - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/19 06:59:54 UTC, 1 replies.
- extremely slow k-means version - posted by ticup <ti...@gmail.com> on 2014/04/19 13:04:59 UTC, 2 replies.
- Help with error initializing SparkR. - posted by tongzzz <to...@gmail.com> on 2014/04/20 04:12:56 UTC, 2 replies.
- questions about toArray and ClassTag - posted by wxhsdp <wx...@gmail.com> on 2014/04/20 05:50:27 UTC, 0 replies.
- efficient joining - posted by Joe L <se...@yahoo.com> on 2014/04/20 05:55:44 UTC, 0 replies.
- Task splitting among workers - posted by David Thomas <dt...@gmail.com> on 2014/04/20 06:25:51 UTC, 2 replies.
- question about the SocketReceiver - posted by YouPeng Yang <yy...@gmail.com> on 2014/04/20 10:36:13 UTC, 1 replies.
- Spark recovery from bad nodes - posted by rama0120 <la...@gmail.com> on 2014/04/20 17:05:43 UTC, 2 replies.
- evaluate spark - posted by Joe L <se...@yahoo.com> on 2014/04/20 23:03:30 UTC, 0 replies.
- Hung inserts? - posted by Brad Heller <br...@gmail.com> on 2014/04/20 23:50:17 UTC, 4 replies.
- Long running time for GraphX pagerank in dataset com-Friendster - posted by Qi Song <so...@gmail.com> on 2014/04/21 03:18:18 UTC, 2 replies.
- Re: Are there any plans to develop Graphx Streaming? - posted by Qi Song <so...@gmail.com> on 2014/04/21 03:27:13 UTC, 1 replies.
- running tests selectively - posted by Arun Ramakrishnan <si...@gmail.com> on 2014/04/21 05:58:56 UTC, 3 replies.
- socketTextStream() call on Cluster stream no records - posted by "Kulkarni, Vikram" <vi...@hp.com> on 2014/04/21 09:15:17 UTC, 0 replies.
- Spark running slow for small hadoop files of 10 mb size - posted by neeravsalaria <ne...@gmail.com> on 2014/04/21 13:21:28 UTC, 2 replies.
- Do I need to learn Scala for spark ? - posted by arpan57 <ar...@gmail.com> on 2014/04/21 15:40:50 UTC, 2 replies.
- custom kryoserializer class under mesos - posted by Soren Macbeth <so...@yieldbot.com> on 2014/04/21 17:41:23 UTC, 0 replies.
- stdout in workers - posted by Jim Carroll <ji...@gmail.com> on 2014/04/21 19:59:51 UTC, 1 replies.
- Spark is slow - posted by Joe L <se...@yahoo.com> on 2014/04/21 20:23:16 UTC, 6 replies.
- spark-0.9.1 compiled with Hadoop 2.3.0 doesn't work with S3? - posted by Nan Zhu <zh...@gmail.com> on 2014/04/21 20:30:06 UTC, 2 replies.
- checkpointing without streaming? - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/21 20:34:48 UTC, 4 replies.
- Problem connecting to HDFS in Spark shell - posted by "Williams, Ken" <Ke...@windlogics.com> on 2014/04/21 21:03:53 UTC, 3 replies.
- Spark Streaming source from Amazon Kinesis - posted by Nicholas Chammas <ni...@gmail.com> on 2014/04/21 22:00:13 UTC, 5 replies.
- [ann] Spark-NYC Meetup - posted by François Le Lay <fl...@spotify.com> on 2014/04/21 22:31:00 UTC, 1 replies.
- ERROR TaskSchedulerImpl: Lost an executor - posted by jaeholee <jh...@lbl.gov> on 2014/04/21 23:26:19 UTC, 16 replies.
- Adding to an RDD - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/22 03:33:38 UTC, 1 replies.
- two calls of saveAsTextFile() have different results on the same RDD - posted by randylu <ra...@gmail.com> on 2014/04/22 04:52:52 UTC, 6 replies.
- how to solve this problem? - posted by gogototo <wa...@gmail.com> on 2014/04/22 05:20:06 UTC, 2 replies.
- Need clarification of joining streams - posted by gaganbm <ga...@gmail.com> on 2014/04/22 06:25:26 UTC, 0 replies.
- Efficient Aggregation over DB data - posted by Sai Prasanna <an...@gmail.com> on 2014/04/22 11:02:39 UTC, 0 replies.
- Question about running spark on yarn - posted by Gordon Wang <gw...@gopivotal.com> on 2014/04/22 11:43:19 UTC, 5 replies.
- Bind exception while running FlumeEventCount - posted by NehaS Singh <Ne...@lntinfotech.com> on 2014/04/22 11:48:01 UTC, 1 replies.
- 'Filesystem closed' while running spark job - posted by Marcin Cylke <ma...@ext.allegro.pl> on 2014/04/22 12:28:15 UTC, 1 replies.
- help me - posted by Joe L <se...@yahoo.com> on 2014/04/22 13:15:11 UTC, 0 replies.
- Spark runs applications in an inconsistent way - posted by Aureliano Buendia <bu...@gmail.com> on 2014/04/22 16:00:05 UTC, 3 replies.
- Some questions in using Graphx - posted by wu zeming <ze...@gmail.com> on 2014/04/22 17:20:00 UTC, 3 replies.
- internship opportunity - posted by Tom Vacek <mi...@gmail.com> on 2014/04/22 18:27:04 UTC, 0 replies.
- Re: java.net.SocketException on reduceByKey() in pyspark - posted by benlaird <be...@capitalone.com> on 2014/04/22 21:43:44 UTC, 0 replies.
- Running large join in ALS example through PySpark - posted by "Laird, Benjamin" <Be...@capitalone.com> on 2014/04/22 22:07:30 UTC, 0 replies.
- Joining large dataset causes failure on Spark! - posted by Hasan Asfoor <as...@gmail.com> on 2014/04/22 23:47:25 UTC, 0 replies.
- GraphX: .edges.distinct().count() is 10? - posted by Ryan Compton <co...@gmail.com> on 2014/04/23 01:52:36 UTC, 3 replies.
- No configuration setting found for key 'akka.version' - posted by mbaryu <cb...@infoblox.com> on 2014/04/23 04:21:21 UTC, 0 replies.
- Custom KryoSerializer - posted by Soren Macbeth <so...@yieldbot.com> on 2014/04/23 05:17:29 UTC, 1 replies.
- no response in spark web UI - posted by wxhsdp <wx...@gmail.com> on 2014/04/23 05:21:49 UTC, 2 replies.
- Need help about how hadoop works. - posted by Carter <gy...@hotmail.com> on 2014/04/23 08:42:12 UTC, 6 replies.
- Accesing Hdfs from Spark gives TokenCache error "Can't get Master Kerberos principal for use as renewer" - posted by Spyros Gasteratos <sp...@gmail.com> on 2014/04/23 11:04:12 UTC, 0 replies.
- help - posted by Joe L <se...@yahoo.com> on 2014/04/23 11:04:52 UTC, 8 replies.
- Hadoop—streaming - posted by zhxfl <29...@qq.com> on 2014/04/23 11:09:20 UTC, 1 replies.
- about rdd.filter() - posted by randylu <ra...@gmail.com> on 2014/04/23 11:45:38 UTC, 5 replies.
- SparkException: env SPARK_YARN_APP_JAR is not set - posted by 肥肥 <19...@qq.com> on 2014/04/23 12:05:26 UTC, 0 replies.
- Comparing RDD Items - posted by Jared Rodriguez <jr...@kitedesk.com> on 2014/04/23 16:48:26 UTC, 1 replies.
- default spark partitioner - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/04/23 17:21:58 UTC, 2 replies.
- skip lines in spark - posted by Chengi Liu <ch...@gmail.com> on 2014/04/23 18:18:01 UTC, 7 replies.
- Spark hangs when i call parallelize + count on a ArrayList having 40k elements - posted by amit karmakar <am...@gmail.com> on 2014/04/23 18:23:08 UTC, 1 replies.
- error in mllib lr example code - posted by Mohit Jaggi <mo...@gmail.com> on 2014/04/23 18:34:00 UTC, 4 replies.
- Is Spark a good choice for geospatial/GIS applications? Is a community volunteer needed in this area? - posted by neveroutgunned <ne...@hush.com> on 2014/04/23 20:00:37 UTC, 3 replies.
- understanding stages - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/23 21:06:23 UTC, 5 replies.
- Re: ArrayIndexOutOfBoundsException in ALS.implicit - posted by Xiangrui Meng <me...@gmail.com> on 2014/04/23 21:52:39 UTC, 0 replies.
- How do I access the SPARK SQL - posted by diplomatic Guru <di...@gmail.com> on 2014/04/23 22:30:24 UTC, 10 replies.
- Failed to run count? - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/24 00:19:50 UTC, 1 replies.
- GraphX, Kryo and BoundedPriorityQueue? - posted by Ryan Compton <co...@gmail.com> on 2014/04/24 00:33:32 UTC, 0 replies.
- RE: - posted by "Buttler, David" <bu...@llnl.gov> on 2014/04/24 03:10:38 UTC, 0 replies.
- GraphX: Help understanding the limitations of Pregel - posted by Ryan Compton <co...@gmail.com> on 2014/04/24 03:20:11 UTC, 3 replies.
- how to set spark.executor.memory and heap size - posted by wxhsdp <wx...@gmail.com> on 2014/04/24 05:21:36 UTC, 18 replies.
- Access Last Element of RDD - posted by Sai Prasanna <an...@gmail.com> on 2014/04/24 06:51:13 UTC, 12 replies.
- Re: SparkPi performance-3 cluster standalone mode - posted by Adnan <ns...@gmail.com> on 2014/04/24 13:43:40 UTC, 1 replies.
- Deploying a python code on a spark cluster - posted by Shubhabrata Roy <sh...@realeyesit.com> on 2014/04/24 15:40:59 UTC, 0 replies.
- Deploying a python code on a spark EC2 cluster - posted by Shubhabrata <ma...@gmail.com> on 2014/04/24 15:46:03 UTC, 8 replies.
- How to see org.apache.spark.executor.Executor logs - posted by amit karmakar <am...@gmail.com> on 2014/04/24 16:58:52 UTC, 0 replies.
- reduceByKeyAndWindow - spark internals - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/04/24 17:26:08 UTC, 1 replies.
- IDE for sparkR - posted by phoenixbai <ph...@126.com> on 2014/04/24 17:43:52 UTC, 1 replies.
- Trying to use pyspark mllib NaiveBayes - posted by John King <us...@gmail.com> on 2014/04/24 20:38:57 UTC, 5 replies.
- Spark mllib throwing error - posted by John King <us...@gmail.com> on 2014/04/24 20:49:59 UTC, 7 replies.
- spark mllib to jblas calls..and comparison with VW - posted by Mohit Jaggi <mo...@ayasdi.com> on 2014/04/25 00:14:52 UTC, 1 replies.
- compile spark 0.9.1 in hadoop 2.2 above exception - posted by "martin.ou" <Ma...@orchestrallinc.cn> on 2014/04/25 02:33:08 UTC, 1 replies.
- Finding bad data - posted by Jim Blomo <ji...@gmail.com> on 2014/04/25 03:15:04 UTC, 1 replies.
- parallelize for a large Seq is extreamly slow. - posted by Earthson Lu <ea...@gmail.com> on 2014/04/25 05:01:46 UTC, 11 replies.
- Monitoring the total network I/O done by an application? - posted by Semih Salihoglu <se...@stanford.edu> on 2014/04/25 05:24:29 UTC, 0 replies.
- what is the best way to do cartesian - posted by Qin Wei <we...@dewmobile.net> on 2014/04/25 06:05:40 UTC, 3 replies.
- Problem with the Item-Based Collaborative Filtering Recommendation Algorithms in spark - posted by Qin Wei <we...@dewmobile.net> on 2014/04/25 06:55:26 UTC, 5 replies.
- Spark reads partitions in a wrong order - posted by Mingyu Kim <mk...@palantir.com> on 2014/04/25 09:10:57 UTC, 1 replies.
- MultipleOutputs IdentityReducer - posted by Andre Kuhnen <an...@gmail.com> on 2014/04/25 12:51:27 UTC, 2 replies.
- read file from hdfs - posted by Joe L <se...@yahoo.com> on 2014/04/25 14:38:03 UTC, 1 replies.
- Questions about productionizing spark - posted by Han JU <ju...@gmail.com> on 2014/04/25 14:39:28 UTC, 0 replies.
- strange error - posted by Joe L <se...@yahoo.com> on 2014/04/25 16:10:43 UTC, 0 replies.
- Securing Spark's Network - posted by Jacob Eisinger <je...@us.ibm.com> on 2014/04/25 17:23:01 UTC, 4 replies.
- Spark & Shark 0.9.1 on ec2 with Hadoop 2 error - posted by jesseerdmann <je...@umn.edu> on 2014/04/25 19:46:53 UTC, 1 replies.
- Strange lookup behavior. Possible bug? - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2014/04/25 19:55:23 UTC, 5 replies.
- Build times for Spark - posted by "Williams, Ken" <Ke...@windlogics.com> on 2014/04/25 21:53:45 UTC, 7 replies.
- Scala Spark / Shark: How to access existing Hive tables in Hortonworks? - posted by Darq Moth <da...@gmail.com> on 2014/04/25 22:30:59 UTC, 2 replies.
- Running out of memory Naive Bayes - posted by John King <us...@gmail.com> on 2014/04/26 04:06:37 UTC, 9 replies.
- Question about Transforming huge files from Local to HDFS - posted by PengWeiPRC <pe...@gmx.com> on 2014/04/26 05:50:33 UTC, 1 replies.
- how to get subArray without copy - posted by wxhsdp <wx...@gmail.com> on 2014/04/26 11:26:26 UTC, 1 replies.
- Parquet-SPARK-PIG integration. - posted by suman bharadwaj <su...@gmail.com> on 2014/04/26 12:01:20 UTC, 1 replies.
- is it okay to reuse objects across RDD's? - posted by "Lisonbee, Todd" <to...@intel.com> on 2014/04/26 14:59:22 UTC, 16 replies.
- Using Spark in IntelliJ Scala Console - posted by Jonathan Chayat <jo...@supersonicads.com> on 2014/04/26 19:47:34 UTC, 5 replies.
- questions about debugging a spark application - posted by wxhsdp <wx...@gmail.com> on 2014/04/27 05:19:07 UTC, 6 replies.
- Any advice for using big spark.cleaner.delay value in Spark Streaming? - posted by buremba <em...@gmail.com> on 2014/04/27 14:40:49 UTC, 1 replies.
- Running a spark-submit compatible app in spark-shell - posted by Roger Hoover <ro...@gmail.com> on 2014/04/28 00:14:05 UTC, 5 replies.
- spark running examples error - posted by Joe L <se...@yahoo.com> on 2014/04/28 04:32:13 UTC, 0 replies.
- NullPointerException when run SparkPI using YARN env - posted by "martin.ou" <Ma...@orchestrallinc.cn> on 2014/04/28 04:40:30 UTC, 1 replies.
- Spark with Parquet - posted by Sai Prasanna <an...@gmail.com> on 2014/04/28 08:41:47 UTC, 1 replies.
- Shuffle Spill Issue - posted by "Liu, Raymond" <ra...@intel.com> on 2014/04/28 10:18:35 UTC, 5 replies.
- Spark 1.0 run job fail - posted by "Shihaoliang (Shihaoliang)" <sh...@huawei.com> on 2014/04/28 11:32:31 UTC, 0 replies.
- getting an error - posted by Joe L <se...@yahoo.com> on 2014/04/28 11:43:22 UTC, 0 replies.
- what does broadcast_0 stand for - posted by wxhsdp <wx...@gmail.com> on 2014/04/28 12:15:01 UTC, 4 replies.
- NoSuchMethodError from Spark Java - posted by Jared Rodriguez <jr...@kitedesk.com> on 2014/04/28 12:59:30 UTC, 6 replies.
- Cannot compile SIMR with Spark 9.1 - posted by lukas nalezenec <lu...@gmail.com> on 2014/04/28 14:10:15 UTC, 0 replies.
- Java Spark Streaming - SparkFlumeEvent - posted by "Kulkarni, Vikram" <vi...@hp.com> on 2014/04/28 14:46:11 UTC, 2 replies.
- Running parallel jobs in the same driver with Futures? - posted by Ian Ferreira <ia...@hotmail.com> on 2014/04/28 17:38:29 UTC, 0 replies.
- MLLib - libgfortran LD_LIBRARY_PATH - posted by Shubham Chopra <sh...@gmail.com> on 2014/04/28 18:16:29 UTC, 2 replies.
- K-means with large K - posted by "Buttler, David" <bu...@llnl.gov> on 2014/04/28 18:19:24 UTC, 4 replies.
- What is the recommended way to store state across RDDs? - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/04/28 18:44:12 UTC, 1 replies.
- running SparkALS - posted by Diana Carroll <dc...@cloudera.com> on 2014/04/28 19:30:09 UTC, 8 replies.
- Read from list of files in parallel - posted by Pat Ferrel <pa...@gmail.com> on 2014/04/28 19:58:29 UTC, 0 replies.
- Spark 0.9.1 -- assembly fails? - posted by kamatsuoka <ke...@gmail.com> on 2014/04/28 22:49:51 UTC, 2 replies.
- File list read into single RDD - posted by Pat Ferrel <pa...@gmail.com> on 2014/04/29 01:23:58 UTC, 4 replies.
- processing s3n:// files in parallel - posted by Art Peel <fo...@gmail.com> on 2014/04/29 01:35:23 UTC, 6 replies.
- launching concurrent jobs programmatically - posted by ishaaq <is...@gmail.com> on 2014/04/29 01:39:54 UTC, 3 replies.
- Re: how to declare tuple return type - posted by wxhsdp <wx...@gmail.com> on 2014/04/29 01:46:28 UTC, 0 replies.
- How to declare Tuple return type for a function - posted by SK <sk...@gmail.com> on 2014/04/29 03:22:02 UTC, 2 replies.
- question on setup() and cleanup() methods for map() and reduce() - posted by "Parsian, Mahmoud" <mp...@illumina.com> on 2014/04/29 03:22:20 UTC, 2 replies.
- Shuffle phase is very slow, any help, thx! - posted by gogototo <wa...@gmail.com> on 2014/04/29 03:41:15 UTC, 1 replies.
- Why Spark require this object to be serializerable? - posted by Earthson <Ea...@gmail.com> on 2014/04/29 05:46:45 UTC, 7 replies.
- How to run spark well on yarn - posted by Sophia <sl...@163.com> on 2014/04/29 07:59:33 UTC, 0 replies.
- Issue during Spark streaming with ZeroMQ source - posted by "Francis.Hu" <fr...@reachjunction.com> on 2014/04/29 08:59:59 UTC, 1 replies.
- Spark RDD cache memory usage - posted by Han JU <ju...@gmail.com> on 2014/04/29 10:24:02 UTC, 1 replies.
- Re: Issue during Spark streaming with ZeroMQ source - posted by "Francis.Hu" <fr...@reachjunction.com> on 2014/04/29 11:14:38 UTC, 1 replies.
- Joining not-pair RDDs in Spark - posted by jsantos <js...@tecsisa.com> on 2014/04/29 11:55:40 UTC, 2 replies.
- User/Product Clustering with pySpark ALS - posted by "Laird, Benjamin" <Be...@capitalone.com> on 2014/04/29 16:29:07 UTC, 1 replies.
- Storage information about an RDD from the API - posted by Andras Nemeth <an...@lynxanalytics.com> on 2014/04/29 18:34:57 UTC, 1 replies.
- Python Spark on YARN - posted by Guanhua Yan <gh...@lanl.gov> on 2014/04/29 18:51:18 UTC, 2 replies.
- What is Seq[V] in updateStateByKey? - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/04/29 20:50:26 UTC, 5 replies.
- packaging time - posted by SK <sk...@gmail.com> on 2014/04/29 20:50:49 UTC, 2 replies.
- Delayed Scheduling - Setting spark.locality.wait.node parameter in interactive shell - posted by Sai Prasanna <an...@gmail.com> on 2014/04/29 21:04:32 UTC, 0 replies.
- Spark: issues with running a sbt fat jar due to akka dependencies - posted by Shivani Rao <ra...@gmail.com> on 2014/04/29 21:32:26 UTC, 1 replies.
- java.lang.ClassCastException for groupByKey - posted by amit karmakar <am...@gmail.com> on 2014/04/29 22:13:27 UTC, 0 replies.
- Spark cluster standalone setup - posted by pradeep_s <sr...@gmail.com> on 2014/04/29 23:22:46 UTC, 0 replies.
- Spark's behavior - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/04/29 23:28:12 UTC, 4 replies.
- rdd ordering gets scrambled - posted by Mohit Jaggi <mo...@gmail.com> on 2014/04/29 23:44:44 UTC, 0 replies.
- Re: Spark cluster standalone setup memory issue - posted by pradeep_s <sr...@gmail.com> on 2014/04/30 00:16:19 UTC, 0 replies.
- About pluggable storage roadmap? - posted by "Liu, Raymond" <ra...@intel.com> on 2014/04/30 04:26:53 UTC, 0 replies.
- sparkR - is it possible to run sparkR on yarn? - posted by phoenix bai <mi...@gmail.com> on 2014/04/30 04:56:16 UTC, 1 replies.
- JavaSparkConf - posted by Soren Macbeth <so...@yieldbot.com> on 2014/04/30 05:32:20 UTC, 2 replies.
- How fast would you expect shuffle serialize to be? - posted by "Liu, Raymond" <ra...@intel.com> on 2014/04/30 06:34:15 UTC, 6 replies.
- Re: Union of 2 RDD's only returns the first one - posted by Mingyu Kim <mk...@palantir.com> on 2014/04/30 07:22:01 UTC, 8 replies.
- Setting spark.locality.wait.node parameter in interactive shell - posted by Sai Prasanna <an...@gmail.com> on 2014/04/30 08:10:46 UTC, 0 replies.
- the spark configuage - posted by Sophia <sl...@163.com> on 2014/04/30 09:58:28 UTC, 5 replies.
- something about memory usage - posted by wxhsdp <wx...@gmail.com> on 2014/04/30 13:52:57 UTC, 1 replies.
- Can a job running on a cluster read from a local file path ? - posted by Shubhabrata <ma...@gmail.com> on 2014/04/30 16:16:41 UTC, 0 replies.
- new Washington DC Area Spark Meetup - posted by "Donna-M. Fernandez" <do...@metistream.com> on 2014/04/30 16:34:47 UTC, 0 replies.
- Reading multiple S3 objects, transforming, writing back one - posted by Peter <th...@yahoo.com> on 2014/04/30 20:14:44 UTC, 3 replies.