- Re: how to get actual count from as long from JavaDStream ? - posted by Tathagata Das <ta...@gmail.com> on 2014/10/01 00:07:26 UTC, 5 replies.
- Re: processing large number of files - posted by Liquan Pei <li...@gmail.com> on 2014/10/01 00:11:30 UTC, 0 replies.
- Re: Multiple exceptions in Spark Streaming - posted by Tathagata Das <ta...@gmail.com> on 2014/10/01 00:16:50 UTC, 5 replies.
- RDD not getting generated - posted by vvarma <vi...@gmail.com> on 2014/10/01 00:27:42 UTC, 0 replies.
- Re: Handling tree reduction algorithm with Spark in parallel - posted by Debasish Das <de...@gmail.com> on 2014/10/01 00:44:52 UTC, 4 replies.
- apache spark union function cause executors disassociate (Lost executor 1 on 172.32.1.12: remote Akka client disassociated) - posted by Edwin <al...@yahoo.com> on 2014/10/01 00:55:51 UTC, 2 replies.
- memory vs data_size - posted by anny9699 <an...@gmail.com> on 2014/10/01 01:11:11 UTC, 2 replies.
- A sample for generating big data - and some design questions - posted by Steve Lewis <lo...@gmail.com> on 2014/10/01 02:16:51 UTC, 0 replies.
- Poor performance writing to S3 - posted by Gustavo Arjones <ga...@socialmetrix.com> on 2014/10/01 03:03:55 UTC, 1 replies.
- Re: Short Circuit Local Reads - posted by Andrew Ash <an...@andrewash.com> on 2014/10/01 03:28:01 UTC, 1 replies.
- Re: Simple Question: Spark Streaming Applications - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/10/01 03:39:08 UTC, 1 replies.
- Re: Reading from HBase is too slow - posted by Ted Yu <yu...@gmail.com> on 2014/10/01 06:04:34 UTC, 9 replies.
- Re: Spark 1.1.0 hbase_inputformat.py not work - posted by Kan Zhang <kz...@apache.org> on 2014/10/01 06:37:49 UTC, 2 replies.
- How to get SparckContext inside mapPartitions? - posted by Henry Hung <YT...@winbond.com> on 2014/10/01 06:52:14 UTC, 1 replies.
- IPython Notebook Debug Spam - posted by Rick Richardson <ri...@gmail.com> on 2014/10/01 07:14:00 UTC, 7 replies.
- How to read just specified columns from parquet file using SparkSQL. - posted by mykidong <my...@gmail.com> on 2014/10/01 07:31:50 UTC, 0 replies.
- Dynamic visualizations from Spark Streaming output? - posted by YaoPau <jo...@gmail.com> on 2014/10/01 08:27:13 UTC, 1 replies.
- Re: sbt run with spark.ContextCleaner ERROR - posted by Grega Kešpret <gr...@celtra.com> on 2014/10/01 09:03:40 UTC, 0 replies.
- persistent state for spark streaming - posted by Chia-Chun Shih <ch...@gmail.com> on 2014/10/01 09:49:33 UTC, 3 replies.
- any code examples demonstrating spark streaming applications which depend on states? - posted by Chia-Chun Shih <ch...@gmail.com> on 2014/10/01 10:13:39 UTC, 2 replies.
- Spark AccumulatorParam generic - posted by Johan Stenberg <jo...@gmail.com> on 2014/10/01 12:33:12 UTC, 2 replies.
- Re: Installation question - posted by Sean Owen <so...@cloudera.com> on 2014/10/01 13:25:50 UTC, 0 replies.
- Re: Spark SQL + Hive + JobConf NoClassDefFoundError - posted by Patrick McGloin <mc...@gmail.com> on 2014/10/01 13:35:31 UTC, 0 replies.
- Re: MLLib ALS question - posted by Alex T <ch...@gmail.com> on 2014/10/01 13:36:26 UTC, 1 replies.
- Re: Spark Streaming for time consuming job - posted by Mayur Rustagi <ma...@gmail.com> on 2014/10/01 13:59:02 UTC, 1 replies.
- Solution for small files in HDFS - posted by rzykov <rz...@gmail.com> on 2014/10/01 14:07:10 UTC, 0 replies.
- KryoSerializer exception in Spark Streaming JAVA - posted by Mudassar Sarwar <mu...@northbaysolutions.net> on 2014/10/01 15:13:02 UTC, 0 replies.
- Problem with very slow behaviour of TorrentBroadcast vs. HttpBroadcast - posted by Guillaume Pitel <gu...@exensa.com> on 2014/10/01 16:31:46 UTC, 0 replies.
- GraphX: Types for the Nodes and Edges - posted by Oshi <os...@gmail.com> on 2014/10/01 16:35:31 UTC, 4 replies.
- MultipleTextOutputFormat with new hadoop API - posted by Tomer Benyamini <to...@gmail.com> on 2014/10/01 16:53:06 UTC, 4 replies.
- Relation between worker memory and executor memory in standalone mode - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/01 17:04:38 UTC, 7 replies.
- org.apache.spark.sql.catalyst.errors.package$TreeNodeException: - posted by tonsat <to...@gmail.com> on 2014/10/01 17:09:18 UTC, 1 replies.
- Re: MLLib: Missing value imputation - posted by Debasish Das <de...@gmail.com> on 2014/10/01 17:35:51 UTC, 1 replies.
- spark.driver.memory is not set (pyspark, 1.1.0) - posted by jamborta <ja...@gmail.com> on 2014/10/01 18:26:14 UTC, 9 replies.
- can I think of JavaDStream<> foreachRDD() as being 'for each mini batch' ? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/10/01 19:51:39 UTC, 0 replies.
- Re: can I think of JavaDStream<> foreachRDD() as being 'for each mini batch' ? - posted by Sean Owen <so...@cloudera.com> on 2014/10/01 20:27:49 UTC, 1 replies.
- Protocol buffers with Spark ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/10/01 20:35:41 UTC, 0 replies.
- Task deserialization problem using 1.1.0 for Hadoop 2.4 - posted by Timothy Potter <th...@gmail.com> on 2014/10/01 21:43:20 UTC, 1 replies.
- MLlib Linear Regression Mismatch - posted by Krishna Sankar <ks...@gmail.com> on 2014/10/01 21:43:20 UTC, 2 replies.
- Spark Monitoring with Ganglia - posted by danilopds <da...@gmail.com> on 2014/10/01 22:30:04 UTC, 3 replies.
- Re: Question About Submit Application - posted by danilopds <da...@gmail.com> on 2014/10/01 22:31:30 UTC, 0 replies.
- still "GC overhead limit exceeded" after increasing heap space - posted by anny9699 <an...@gmail.com> on 2014/10/01 22:37:18 UTC, 8 replies.
- run scalding on spark - posted by Koert Kuipers <ko...@tresata.com> on 2014/10/01 22:41:29 UTC, 2 replies.
- Print Decision Tree Models - posted by Jimmy McErlain <ji...@sellpoints.com> on 2014/10/01 23:13:28 UTC, 2 replies.
- Creating a feature vector from text before using with MLLib - posted by Soumya Simanta <so...@gmail.com> on 2014/10/01 23:18:32 UTC, 2 replies.
- Determining number of executors within RDD - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/01 23:29:51 UTC, 0 replies.
- Spark And Mapr - posted by "Addanki, Santosh Kumar" <sa...@sap.com> on 2014/10/02 00:00:55 UTC, 5 replies.
- Multiple spark shell sessions - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2014/10/02 01:29:31 UTC, 2 replies.
- Spark inside Eclipse - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2014/10/02 01:35:23 UTC, 6 replies.
- Re: timestamp not implemented yet - posted by "barge.nilesh" <ba...@gmail.com> on 2014/10/02 02:09:46 UTC, 1 replies.
- spark.cleaner.ttl - posted by SK <sk...@gmail.com> on 2014/10/02 02:27:09 UTC, 0 replies.
- What can be done if a FlatMapFunctions generated more data that can be held in memory - posted by Steve Lewis <lo...@gmail.com> on 2014/10/02 03:01:58 UTC, 1 replies.
- Help Troubleshooting Naive Bayes - posted by Mike Bernico <mi...@gmail.com> on 2014/10/02 03:31:45 UTC, 4 replies.
- Issue with Partitioning - posted by Ankur Srivastava <an...@gmail.com> on 2014/10/02 07:12:17 UTC, 2 replies.
- Implicit conversion RDD -> SchemaRDD - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/02 11:00:50 UTC, 2 replies.
- Confusion over how to deploy/run JAR files to a Spark Cluster - posted by Mark Mandel <ma...@gmail.com> on 2014/10/02 11:14:14 UTC, 3 replies.
- how to send message to specific vertex by Pregel api - posted by Yifan LI <ia...@gmail.com> on 2014/10/02 12:42:29 UTC, 0 replies.
- Re: registering Array of CompactBuffer to Kryo - posted by Daniel Darabos <da...@lynxanalytics.com> on 2014/10/02 12:50:07 UTC, 1 replies.
- Is there a way to provide individual property to each Spark executor? - posted by Vladimir Tretyakov <vl...@sematext.com> on 2014/10/02 14:25:14 UTC, 1 replies.
- [SparkSQL] Function parity with Shark? - posted by Yana Kadiyska <ya...@gmail.com> on 2014/10/02 15:56:29 UTC, 4 replies.
- partition size for initial read - posted by jamborta <ja...@gmail.com> on 2014/10/02 16:00:19 UTC, 3 replies.
- Type problem in Java when using flatMapValues - posted by Robin Keunen <ro...@lampiris.be> on 2014/10/02 17:15:25 UTC, 2 replies.
- Re: SparkSQL DataType mappings - posted by Yin Huai <hu...@gmail.com> on 2014/10/02 17:32:29 UTC, 1 replies.
- Re: Kafka Spark Streaming job has an issue when the worker reading from Kafka is killed - posted by maddenpj <ma...@gmail.com> on 2014/10/02 18:45:49 UTC, 1 replies.
- weird YARN errors on new Spark on Yarn cluster - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/10/02 18:47:35 UTC, 3 replies.
- Fwd: Second Bay Area Tachyon meetup: October 21st, hosted by Pivotal (Limited Space) - posted by Haoyuan Li <ha...@gmail.com> on 2014/10/02 19:26:27 UTC, 0 replies.
- Larger heap leads to perf degradation due to GC - posted by Mingyu Kim <mk...@palantir.com> on 2014/10/02 19:30:46 UTC, 6 replies.
- GraphX Java API Timeline - posted by "Adams, Jeremiah" <je...@pearson.com> on 2014/10/02 20:10:39 UTC, 2 replies.
- Application details for failed and teminated jobs - posted by SK <sk...@gmail.com> on 2014/10/02 20:31:12 UTC, 0 replies.
- Re: Application details for failed and teminated jobs - posted by Marcelo Vanzin <va...@cloudera.com> on 2014/10/02 20:35:04 UTC, 0 replies.
- Re: Can not see any spark metrics on ganglia-web - posted by danilopds <da...@gmail.com> on 2014/10/02 21:29:42 UTC, 2 replies.
- Block removal causes Akka timeouts - posted by maddenpj <ma...@gmail.com> on 2014/10/02 21:30:22 UTC, 0 replies.
- Sorting a Sequence File - posted by jritz <jm...@gmail.com> on 2014/10/02 21:32:17 UTC, 1 replies.
- HiveContext: cache table not supported for partitioned table? - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/10/02 21:39:05 UTC, 2 replies.
- Spark SQL: ArrayIndexOutofBoundsException - posted by SK <sk...@gmail.com> on 2014/10/03 00:35:04 UTC, 5 replies.
- Strategies for reading large numbers of files - posted by Landon Kuhn <la...@janrain.com> on 2014/10/03 01:10:21 UTC, 6 replies.
- how to debug ExecutorLostFailure - posted by jamborta <ja...@gmail.com> on 2014/10/03 01:25:06 UTC, 1 replies.
- Getting table info from HiveContext - posted by Banias <ca...@yahoo.com.INVALID> on 2014/10/03 02:17:49 UTC, 2 replies.
- Load multiple parquet file as single RDD - posted by Mohnish Kodnani <mo...@gmail.com> on 2014/10/03 03:05:13 UTC, 1 replies.
- Re: Any issues with repartition? - posted by jamborta <ja...@gmail.com> on 2014/10/03 04:22:00 UTC, 5 replies.
- Re: new error for me - posted by jamborta <ja...@gmail.com> on 2014/10/03 04:23:32 UTC, 1 replies.
- How to make ./bin/spark-sql work with hive? - posted by Li HM <hm...@gmail.com> on 2014/10/03 05:44:45 UTC, 11 replies.
- Setup/Cleanup for RDD closures? - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/03 06:46:39 UTC, 2 replies.
- SparkSQL on Hive error - posted by Kevin Paul <ke...@gmail.com> on 2014/10/03 10:43:41 UTC, 4 replies.
- spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException - posted by "serkan.dogan" <fo...@yahoo.com> on 2014/10/03 10:53:33 UTC, 3 replies.
- Akka "connection refused" when running standalone Scala app on Spark 0.9.2 - posted by Irina Fedulova <fe...@gmail.com> on 2014/10/03 11:32:45 UTC, 4 replies.
- The question about mount ephemeral disk in slave-setup.sh - posted by TANG Gen <ge...@keyrus.com> on 2014/10/03 12:05:32 UTC, 1 replies.
- Could Spark make use of Intel Xeon Phi? - posted by 余 浪 <yu...@gmail.com> on 2014/10/03 12:09:11 UTC, 4 replies.
- Breeze Library usage in Spark - posted by Priya Ch <le...@gmail.com> on 2014/10/03 13:22:44 UTC, 3 replies.
- How to save Spark log into file - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/03 13:43:00 UTC, 0 replies.
- Re: Akka Connection refused - standalone cluster using spark-0.9.0 - posted by irina <fe...@gmail.com> on 2014/10/03 16:38:11 UTC, 0 replies.
- Question about addFiles() - posted by Tom Weber <To...@sas.com> on 2014/10/03 17:14:44 UTC, 0 replies.
- Using GraphX with Spark Streaming? - posted by Arko Provo Mukherjee <ar...@gmail.com> on 2014/10/03 18:40:43 UTC, 2 replies.
- MLlib Collaborative Filtering failed to run with rank 1000 - posted by "jw.cmu" <ji...@gmail.com> on 2014/10/03 19:17:42 UTC, 3 replies.
- [ANN] SparkSQL support for Cassandra with Calliope - posted by Rohit Rai <ro...@tuplejump.com> on 2014/10/03 20:15:24 UTC, 3 replies.
- array size limit vs partition number - posted by anny9699 <an...@gmail.com> on 2014/10/03 20:28:30 UTC, 0 replies.
- Re: window every n elements instead of time based - posted by Michael Allman <mi...@videoamp.com> on 2014/10/03 22:09:41 UTC, 4 replies.
- pyspark on python 3 - posted by Ariel Rokem <ar...@gmail.com> on 2014/10/03 23:35:04 UTC, 3 replies.
- Worker with no Executor (YARN client-mode) - posted by "jonathan.keebler" <jk...@gmail.com> on 2014/10/03 23:59:36 UTC, 1 replies.
- Re: partitions number with variable number of cores - posted by Gen <ge...@gmail.com> on 2014/10/04 00:06:06 UTC, 0 replies.
- problem with user@spark.apache.org spam filter - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/10/04 00:32:51 UTC, 1 replies.
- Accumulator question - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/10/04 00:41:10 UTC, 0 replies.
- My task is finished successfully, however, I find some exceptions in webpage. - posted by Tim Chou <ti...@gmail.com> on 2014/10/04 00:46:03 UTC, 3 replies.
- any good library to implement multilabel classification on spark? - posted by critikaled <is...@gmail.com> on 2014/10/04 00:58:05 UTC, 0 replies.
- Spark Streaming writing to HDFS - posted by Abraham Jacob <ab...@gmail.com> on 2014/10/04 02:08:55 UTC, 3 replies.
- Null values in Date field only when RDD is saved as File. - posted by Manas Kar <ma...@gmail.com> on 2014/10/04 05:09:02 UTC, 2 replies.
- Trouble getting filtering on field correct - posted by Chop <th...@att.net> on 2014/10/04 05:16:08 UTC, 1 replies.
- Re: android + spark streaming? - posted by ll <du...@gmail.com> on 2014/10/04 08:43:21 UTC, 1 replies.
- scala Vector vs mllib Vector - posted by ll <du...@gmail.com> on 2014/10/04 08:44:18 UTC, 3 replies.
- Impala comparisons - posted by Debasish Das <de...@gmail.com> on 2014/10/04 18:42:52 UTC, 0 replies.
- org/apache/commons/math3/random/RandomGenerator issue - posted by anny9699 <an...@gmail.com> on 2014/10/04 21:59:34 UTC, 8 replies.
- Using FunSuite to test Spark throws NullPointerException - posted by Mario Pastorelli <ma...@teralytics.ch> on 2014/10/04 22:40:04 UTC, 2 replies.
- Dstream Transformations - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2014/10/05 01:49:37 UTC, 4 replies.
- Asynchronous Broadcast from driver to workers, is it possible? - posted by Peng Cheng <pc...@uow.edu.au> on 2014/10/05 04:36:50 UTC, 3 replies.
- mllib sparse vector/matrix vs. graphx graph - posted by ll <du...@gmail.com> on 2014/10/05 06:39:26 UTC, 1 replies.
- Re: New sbt plugin to deploy jobs to EC2 - posted by Felix Garcia Borrego <fb...@gilt.com> on 2014/10/05 16:59:38 UTC, 0 replies.
- java.library.path - posted by Tom <th...@gmail.com> on 2014/10/05 19:06:25 UTC, 1 replies.
- Kafka->HDFS to store as Parquet format - posted by bdev <bu...@gmail.com> on 2014/10/05 20:04:06 UTC, 5 replies.
- Stucked job work well after rdd.count or rdd.collect - posted by Kevin Jung <it...@samsung.com> on 2014/10/06 04:54:01 UTC, 0 replies.
- graphx - mutable? - posted by ll <du...@gmail.com> on 2014/10/06 07:38:31 UTC, 5 replies.
- Re: Stucked job works well after rdd.count or rdd.collect - posted by Kevin Jung <it...@samsung.com> on 2014/10/06 11:14:30 UTC, 0 replies.
- Spark SQL - custom aggregation function (UDAF) - posted by Pei-Lun Lee <pl...@appier.com> on 2014/10/06 11:18:16 UTC, 4 replies.
- Spark and Python using generator of data bigger than RAM as input to sc.parallelize() - posted by ja...@centrum.cz on 2014/10/06 16:16:18 UTC, 6 replies.
- Re: return probability \ confidence instead of actual class - posted by Adamantios Corais <ad...@gmail.com> on 2014/10/06 19:15:10 UTC, 7 replies.
- Spark Streaming saveAsNewAPIHadoopFiles - posted by Abraham Jacob <ab...@gmail.com> on 2014/10/06 20:39:49 UTC, 3 replies.
- combining python and java in a single Spark application - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2014/10/06 20:43:51 UTC, 0 replies.
- Is RDD partition index consistent? - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/06 21:33:22 UTC, 1 replies.
- lazy evaluation of RDD transformation - posted by anny9699 <an...@gmail.com> on 2014/10/06 22:56:46 UTC, 2 replies.
- Spark Standalone on EC2 - posted by Ankur Srivastava <an...@gmail.com> on 2014/10/07 00:06:30 UTC, 3 replies.
- How to make Spark-sql join using HashJoin - posted by Benyi Wang <be...@gmail.com> on 2014/10/07 01:20:34 UTC, 1 replies.
- performance comparison: join vs cogroup? - posted by freedafeng <fr...@yahoo.com> on 2014/10/07 02:16:50 UTC, 0 replies.
- Spark 1.1.0 with Hadoop 2.5.0 - posted by hmxxyy <hm...@gmail.com> on 2014/10/07 07:07:26 UTC, 7 replies.
- HiveServer1 and SparkSQL - posted by "deenar.toraskar" <de...@db.com> on 2014/10/07 09:39:24 UTC, 0 replies.
- Parsing one big multiple line .xml loaded in RDD using Python - posted by ja...@centrum.cz on 2014/10/07 10:06:48 UTC, 2 replies.
- Re: Hive Parquet Serde from Spark - posted by quintona <qu...@gmail.com> on 2014/10/07 12:27:08 UTC, 0 replies.
- Cannot read from s3 using "sc.textFile" - posted by Tomer Benyamini <to...@gmail.com> on 2014/10/07 13:15:22 UTC, 3 replies.
- Same code --works in spark 1.0.2-- but not in spark 1.1.0 - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2014/10/07 14:27:44 UTC, 2 replies.
- Re: Unable to ship external Python libraries in PYSPARK - posted by yh18190 <yh...@gmail.com> on 2014/10/07 15:14:43 UTC, 0 replies.
- Re: Spark SQL -- more than two tables for join - posted by TANG Gen <ge...@keyrus.com> on 2014/10/07 16:11:20 UTC, 2 replies.
- akka.remote.transport.netty.NettyTransport - posted by Jacob Chacko - Catalyst Consulting <ja...@catalystconsulting.be> on 2014/10/07 17:09:00 UTC, 1 replies.
- Re: Schema change on Spark Hive (Parquet file format) table not working - posted by "barge.nilesh" <ba...@gmail.com> on 2014/10/07 18:33:39 UTC, 0 replies.
- Spark Streaming Fault Tolerance (?) - posted by Massimiliano Tomassi <ma...@gmail.com> on 2014/10/07 19:17:47 UTC, 1 replies.
- Stupid Spark question - posted by Steve Lewis <lo...@gmail.com> on 2014/10/07 20:01:22 UTC, 1 replies.
- dynamic sliding window duration - posted by Josh J <jo...@gmail.com> on 2014/10/07 21:50:39 UTC, 1 replies.
- MLLib Linear regression - posted by Sameer Tilak <ss...@live.com> on 2014/10/07 22:41:03 UTC, 4 replies.
- Re: Shuffle files - posted by SK <sk...@gmail.com> on 2014/10/07 23:11:36 UTC, 5 replies.
- anyone else seeing something like https://issues.apache.org/jira/browse/SPARK-3637 - posted by Steve Lewis <lo...@gmail.com> on 2014/10/07 23:45:35 UTC, 1 replies.
- Spark / Kafka connector - CDH5 distribution - posted by Abraham Jacob <ab...@gmail.com> on 2014/10/08 00:36:04 UTC, 3 replies.
- Storing shuffle files on a Tachyon - posted by Soumya Simanta <so...@gmail.com> on 2014/10/08 00:46:32 UTC, 0 replies.
- SparkStreaming program does not start - posted by spr <sp...@yarcdata.com> on 2014/10/08 00:47:52 UTC, 7 replies.
- bug with IPython notebook? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/10/08 01:23:20 UTC, 0 replies.
- Record-at-a-time model for Spark Streaming - posted by Jianneng Li <ji...@berkeley.edu> on 2014/10/08 01:44:58 UTC, 1 replies.
- spark fold question - posted by chinchu <ch...@gmail.com> on 2014/10/08 02:49:24 UTC, 0 replies.
- Spark-Shell: OOM: GC overhead limit exceeded - posted by sranga <sr...@gmail.com> on 2014/10/08 03:57:59 UTC, 1 replies.
- Re: Spark Streaming : Could not compute split, block not found - posted by Tian Zhang <tz...@yahoo.com> on 2014/10/08 07:31:39 UTC, 1 replies.
- Support for Parquet V2 in ParquetTableSupport? - posted by Michael Allman <mi...@videoamp.com> on 2014/10/08 07:33:37 UTC, 2 replies.
- RE: Spark SQL question: why build hashtable for both sides in HashOuterJoin? - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/10/08 08:04:55 UTC, 2 replies.
- Re: How to do broadcast join in SparkSQL - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/08 08:18:37 UTC, 3 replies.
- Mosek Solver with Apache Spark - posted by Raghuveer Chanda <ra...@gmail.com> on 2014/10/08 10:01:46 UTC, 0 replies.
- SparkContext.wholeTextFiles() java.io.FileNotFoundException: File does not exist: - posted by ja...@centrum.cz on 2014/10/08 10:41:18 UTC, 5 replies.
- Interactive interface tool for spark - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/10/08 11:15:32 UTC, 10 replies.
- org/I0Itec/zkclient/serialize/ZkSerializer ClassNotFound - posted by cjwebb <co...@hotmail.com> on 2014/10/08 12:20:18 UTC, 2 replies.
- spark-ec2 - HDFS doesn't start on AWS EC2 cluster - posted by Jan Warchoł <ja...@codilime.com> on 2014/10/08 13:46:37 UTC, 6 replies.
- foreachPartition: write to multiple files - posted by david <da...@free.fr> on 2014/10/08 14:16:04 UTC, 1 replies.
- sparksql connect remote hive cluster - posted by jamborta <ja...@gmail.com> on 2014/10/08 14:33:31 UTC, 1 replies.
- I am expanding my professional network on LinkedIn and would like to connect with you - posted by yaochunnan <ya...@gmail.com> on 2014/10/08 14:41:55 UTC, 0 replies.
- Spark and tree data structures - posted by Silvina Caíno Lores <si...@gmail.com> on 2014/10/08 15:07:12 UTC, 0 replies.
- Error reading from Kafka - posted by Antonio Jesus Navarro <aj...@stratio.com> on 2014/10/08 16:03:45 UTC, 2 replies.
- Re: How could I start new spark cluster with hadoop2.0.2 - posted by st553 <st...@gmail.com> on 2014/10/08 17:14:32 UTC, 1 replies.
- Spark SQL HiveContext Projection Pushdown - posted by Anand Mohan <ch...@gmail.com> on 2014/10/08 19:41:52 UTC, 3 replies.
- meetup october 30-31st in SF - posted by Jeremy Freeman <fr...@gmail.com> on 2014/10/08 20:05:04 UTC, 0 replies.
- Broadcast Torrent fail - then the job dies - posted by Steve Lewis <lo...@gmail.com> on 2014/10/08 20:21:30 UTC, 2 replies.
- executors not created yarn-cluster mode - posted by jamborta <ja...@gmail.com> on 2014/10/08 21:00:19 UTC, 2 replies.
- Is there a way to look at RDD's lineage? Or debug a fault-tolerance error? - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/08 21:01:57 UTC, 5 replies.
- Spark on YARN driver memory allocation bug? - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/10/08 21:12:51 UTC, 4 replies.
- Dedup - posted by "Ge, Yao (Y.)" <yg...@ford.com> on 2014/10/08 21:37:48 UTC, 7 replies.
- Re: Running Spark cluster on local machine, cannot connect to master error - posted by rrussell25 <rr...@gmail.com> on 2014/10/08 22:29:24 UTC, 1 replies.
- protobuf error running spark on hadoop 2.4 - posted by Chuang Liu <li...@gmail.com> on 2014/10/08 22:40:29 UTC, 1 replies.
- Building pyspark with maven? - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/08 23:01:06 UTC, 2 replies.
- coalesce with shuffle or repartition is not necessarily fault-tolerant - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/09 00:42:29 UTC, 6 replies.
- Spark-SQL: SchemaRDD - ClassCastException - posted by Ranga <sr...@gmail.com> on 2014/10/09 00:47:00 UTC, 6 replies.
- Re: GroupBy Key and then sort values with the group - posted by chinchu <ch...@gmail.com> on 2014/10/09 01:03:43 UTC, 2 replies.
- How to configure build.sbt for Spark 1.2.0 - posted by Arun Luthra <ar...@gmail.com> on 2014/10/09 02:35:07 UTC, 3 replies.
- Spark SQL parser bug? - posted by Mohammed Guller <mo...@glassbeam.com> on 2014/10/09 04:26:13 UTC, 10 replies.
- JavaPairDStream saveAsTextFile - posted by SA <sa...@gmail.com> on 2014/10/09 05:53:21 UTC, 2 replies.
- Re: java.io.IOException Error in task deserialization - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/09 06:13:40 UTC, 6 replies.
- Re: [MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights. - posted by DB Tsai <db...@dbtsai.com> on 2014/10/09 11:23:25 UTC, 0 replies.
- DIMSUM item similarity tests - posted by Clive Cox <cl...@rummble.com> on 2014/10/09 11:33:27 UTC, 1 replies.
- Bug a spark task - posted by poiuytrez <gu...@databerries.com> on 2014/10/09 15:16:57 UTC, 2 replies.
- Spark SQL - Exception only when using cacheTable - posted by poiuytrez <gu...@databerries.com> on 2014/10/09 15:37:27 UTC, 6 replies.
- IOException and appcache FileNotFoundException in Spark 1.02 - posted by Ilya Ganelin <il...@gmail.com> on 2014/10/09 16:24:53 UTC, 5 replies.
- [SQL] Set Parquet block size? - posted by Pierre B <pi...@realimpactanalytics.com> on 2014/10/09 17:43:06 UTC, 1 replies.
- Help with using combineByKey - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2014/10/09 17:47:00 UTC, 9 replies.
- java.lang.OutOfMemoryError: Java heap space when running job via spark-submit - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/10/09 18:00:16 UTC, 2 replies.
- spark-sql failing for some tables in hive - posted by sadhan <sa...@gmail.com> on 2014/10/09 18:25:16 UTC, 1 replies.
- Performance with activeSetOpt in GraphImpl.mapReduceTriplets() - posted by Cheuk Lam <ch...@hotmail.com> on 2014/10/09 19:12:14 UTC, 0 replies.
- MLUtil.kfold generates overlapped training and validation set? - posted by Nan Zhu <zh...@gmail.com> on 2014/10/09 20:05:04 UTC, 2 replies.
- Re: New API for TFIDF generation in Spark 1.1.0 - posted by nilesh <ni...@yahoo.com> on 2014/10/09 20:08:00 UTC, 1 replies.
- Does Ipython notebook work with spark? trivial example does not work. Re: bug with IPython notebook? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/10/09 20:50:09 UTC, 4 replies.
- where are my python lambda functions run in yarn-client mode? - posted by esamanas <ev...@gmail.com> on 2014/10/09 21:14:59 UTC, 5 replies.
- How to save ReceiverInputDStream to Hadoop using saveAsNewAPIHadoopFile - posted by bdev <bu...@gmail.com> on 2014/10/09 21:40:31 UTC, 6 replies.
- Spark in cluster [ remote.EndpointWriter: AssociationError] - posted by Morbious <kn...@gmail.com> on 2014/10/09 22:13:41 UTC, 2 replies.
- Convert a org.apache.spark.sql.SchemaRDD[Row] to a RDD of Strings - posted by Soumya Simanta <so...@gmail.com> on 2014/10/09 22:22:47 UTC, 1 replies.
- Spark job doesn't clean after itself - posted by Rohit Pujari <rp...@hortonworks.com> on 2014/10/09 22:47:15 UTC, 1 replies.
- Debug Spark in Cluster Mode - posted by Rohit Pujari <rp...@hortonworks.com> on 2014/10/09 22:49:22 UTC, 1 replies.
- Memory Leaks? 1GB input file turns into 8GB memory use in JVM... from parsing CSV - posted by Aris <ar...@gmail.com> on 2014/10/09 23:25:23 UTC, 1 replies.
- One pass compute() to produce multiple RDDs - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/09 23:55:25 UTC, 1 replies.
- Hung spark executors don't count toward worker memory limit - posted by Keith Simmons <ke...@pulse.io> on 2014/10/10 02:06:20 UTC, 2 replies.
- getting tweets for a specified handle - posted by SK <sk...@gmail.com> on 2014/10/10 02:22:58 UTC, 2 replies.
- Combined HDFS/Kafka Processing - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/10/10 03:13:26 UTC, 0 replies.
- Processing order in Spark - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/10/10 03:29:17 UTC, 3 replies.
- Unable to share Sql between HiveContext and JDBC Thrift Server - posted by Steve Arnold <sa...@gmail.com> on 2014/10/10 03:32:51 UTC, 1 replies.
- Executor and BlockManager memory size - posted by Larry Xiao <xi...@sjtu.edu.cn> on 2014/10/10 03:40:10 UTC, 2 replies.
- Spark SQL Percentile UDAF - posted by Anand Mohan <ch...@gmail.com> on 2014/10/10 03:48:08 UTC, 2 replies.
- How to benchmark SPARK apps? - posted by Theodore Si <sj...@gmail.com> on 2014/10/10 04:39:55 UTC, 2 replies.
- Re: How to benchmark SPARK applications? - posted by Theodore Si <sj...@gmail.com> on 2014/10/10 04:43:23 UTC, 0 replies.
- Intermittent checkpointing failure. - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/10 07:18:13 UTC, 0 replies.
- Is possible to invoke updateStateByKey twice on the same RDD - posted by qihong <qc...@pivotal.io> on 2014/10/10 07:19:56 UTC, 0 replies.
- How to patch sparkSQL on EC2? - posted by "Christos Kozanitis <Christos Kozanitis>" <ko...@berkeley.edu> on 2014/10/10 09:15:10 UTC, 1 replies.
- Delayed hotspot optimizations in Spark - posted by Alexey Romanchuk <al...@gmail.com> on 2014/10/10 09:49:34 UTC, 3 replies.
- Can I run examples on cluster? - posted by Theodore Si <sj...@gmail.com> on 2014/10/10 11:17:01 UTC, 5 replies.
- Re: ExecutorLostFailure (executor lost) - posted by Dhimant <dh...@gmail.com> on 2014/10/10 11:51:23 UTC, 1 replies.
- Re: Windowed Operations - posted by julyfire <he...@gmail.com> on 2014/10/10 12:06:56 UTC, 0 replies.
- Spark on Mesos Issue - Do I need to install Spark on Mesos slaves - posted by Bijoy Deb <bi...@gmail.com> on 2014/10/10 12:58:45 UTC, 2 replies.
- Disabling log4j in Spark-Shell on ec2 stopped working on Wednesday (Oct 8) - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/10/10 15:16:48 UTC, 1 replies.
- How does the Spark Accumulator work under the covers? - posted by "Areg Baghdasaryan (BLOOMBERG/ 731 LEX -)" <ab...@bloomberg.net> on 2014/10/10 16:37:34 UTC, 2 replies.
- Breaking the previous large-scale sort record with Spark - posted by Matei Zaharia <ma...@gmail.com> on 2014/10/10 16:54:16 UTC, 10 replies.
- SparkSQL LEFT JOIN problem - posted by invkrh <in...@gmail.com> on 2014/10/10 17:20:20 UTC, 2 replies.
- Spring hadoop begin to support spark - posted by guosxu <gu...@163.com> on 2014/10/10 17:32:56 UTC, 0 replies.
- Rdd repartitioning - posted by rapelly kartheek <ka...@gmail.com> on 2014/10/10 18:01:51 UTC, 0 replies.
- Spark job (not Spark streaming) doesn't delete un-needed checkpoints. - posted by Sung Hwan Chung <co...@cs.stanford.edu> on 2014/10/10 18:15:11 UTC, 0 replies.
- Application failure in yarn-cluster mode - posted by Christophe Préaud <ch...@kelkoo.com> on 2014/10/10 18:24:28 UTC, 2 replies.
- Re: Akka disassociation on Java SE Embedded - posted by bhusted <br...@gmail.com> on 2014/10/10 22:31:41 UTC, 1 replies.
- add Boulder-Denver Spark meetup to list on website - posted by Michael Oczkowski <Mi...@seeq.com> on 2014/10/10 22:52:57 UTC, 1 replies.
- Issue with java spark broadcast - posted by Jacob Maloney <jm...@conversantmedia.com> on 2014/10/10 23:15:47 UTC, 0 replies.
- Spark Streaming KafkaUtils Issue - posted by Abraham Jacob <ab...@gmail.com> on 2014/10/11 00:30:40 UTC, 9 replies.
- Running Example in local mode with input files - posted by maxpar <hl...@gmail.com> on 2014/10/11 01:01:37 UTC, 0 replies.
- how to find the sources for spark-project - posted by sadhan <sa...@gmail.com> on 2014/10/11 01:24:31 UTC, 2 replies.
- small bug in pyspark - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/10/11 02:24:55 UTC, 1 replies.
- Maryland Meetup - posted by Brian Husted <br...@gmail.com> on 2014/10/11 02:50:00 UTC, 0 replies.
- What if I port Spark from TCP/IP to RDMA? - posted by Theodore Si <sj...@gmail.com> on 2014/10/11 04:27:36 UTC, 1 replies.
- Re: Window comparison matching using the sliding window functionality: feasibility - posted by nitinkak001 <ni...@gmail.com> on 2014/10/11 04:42:02 UTC, 1 replies.
- RDD size in memory - Array[String] vs. case classes - posted by Liam Clarke-Hutchinson <li...@steelsky.co.nz> on 2014/10/11 07:29:17 UTC, 1 replies.
- How To Implement More Than One Subquery in Scala/Spark - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/11 17:08:29 UTC, 3 replies.
- Blog post: An Absolutely Unofficial Way to Connect Tableau to SparkSQL (Spark 1.1) - posted by Denny Lee <de...@gmail.com> on 2014/10/11 18:46:44 UTC, 1 replies.
- Streams: How do RDDs get Aggregated? - posted by jay vyas <ja...@gmail.com> on 2014/10/11 21:30:11 UTC, 2 replies.
- How to convert a non-rdd data to rdd. - posted by rapelly kartheek <ka...@gmail.com> on 2014/10/12 08:15:08 UTC, 6 replies.
- ClasssNotFoundExeception was thrown while trying to save rdd - posted by Tao Xiao <xi...@gmail.com> on 2014/10/12 17:00:07 UTC, 4 replies.
- setting heap space - posted by Chengi Liu <ch...@gmail.com> on 2014/10/12 18:58:16 UTC, 5 replies.
- NullPointerException when deploying JAR to standalone cluster.. - posted by Jorge Simão <js...@gmail.com> on 2014/10/12 21:55:44 UTC, 0 replies.
- Spark in cluster and errors - posted by Morbious <kn...@gmail.com> on 2014/10/12 23:07:33 UTC, 3 replies.
- Nested Query using SparkSQL 1.1.0 - posted by shahab <sh...@gmail.com> on 2014/10/12 23:20:37 UTC, 3 replies.
- Confusion with webUI parameters - posted by rapelly kartheek <ka...@gmail.com> on 2014/10/13 08:26:09 UTC, 0 replies.
- SPARK-3106 fixed? - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/13 10:15:02 UTC, 3 replies.
- Setting SparkSQL configuration - posted by Kevin Paul <ke...@gmail.com> on 2014/10/13 10:28:23 UTC, 1 replies.
- Issue with Spark Twitter Streaming - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2014/10/13 10:44:01 UTC, 0 replies.
- Is "Array Of Struct" supported in json RDDs? is it possible to query this? - posted by shahab <sh...@gmail.com> on 2014/10/13 11:08:25 UTC, 1 replies.
- Adding Supersonic to the "Powered by Spark" list - posted by Maya Bercovitch <ma...@supersonicads.com> on 2014/10/13 11:10:59 UTC, 0 replies.
- Configuration is not effective or configuration errors? - posted by pol <sw...@163.com> on 2014/10/13 11:19:30 UTC, 0 replies.
- Issue on running spark application in Yarn-cluster mode - posted by vishnu86 <vi...@yahoo.com> on 2014/10/13 12:38:03 UTC, 0 replies.
- Inconsistency of implementing accumulator in Java - posted by WonderfullDay <lg...@gmail.com> on 2014/10/13 15:17:16 UTC, 3 replies.
- Question about SVM mlllib... - posted by Alfonso Muñoz Muñoz <Al...@11paths.com> on 2014/10/13 16:03:33 UTC, 1 replies.
- Regarding java version requirement in spark 1.2.0 or upcoming releases - posted by twinkle sachdeva <tw...@gmail.com> on 2014/10/13 16:37:31 UTC, 1 replies.
- SparkSQL: StringType for numeric comparison - posted by invkrh <in...@gmail.com> on 2014/10/13 18:03:19 UTC, 3 replies.
- persist table schema in spark-sql - posted by Sadhan Sood <sa...@gmail.com> on 2014/10/13 18:20:28 UTC, 3 replies.
- read all parquet files in a directory in spark-sql - posted by Sadhan Sood <sa...@gmail.com> on 2014/10/13 18:21:51 UTC, 3 replies.
- SparkSQL: select syntax - posted by invkrh <in...@gmail.com> on 2014/10/13 18:58:11 UTC, 3 replies.
- Re: parquetFile and wilcards - posted by Nicholas Chammas <ni...@gmail.com> on 2014/10/13 19:23:09 UTC, 0 replies.
- S3 Bucket Access - posted by Ranga <sr...@gmail.com> on 2014/10/13 20:03:20 UTC, 13 replies.
- RowMatrix PCA out of heap space error - posted by Yang <te...@gmail.com> on 2014/10/13 21:10:00 UTC, 1 replies.
- Why is parsing a CSV incredibly wasteful with Java Heap memory? - posted by Aris <ar...@gmail.com> on 2014/10/13 22:12:53 UTC, 2 replies.
- Multipart uploads to Amazon S3 from Apache Spark - posted by Nick Chammas <ni...@gmail.com> on 2014/10/13 22:42:26 UTC, 2 replies.
- Re: mlib model viewing and saving - posted by Joseph Bradley <jo...@databricks.com> on 2014/10/13 22:52:26 UTC, 0 replies.
- distributing Scala Map datatypes to RDD - posted by "jon.g.massey" <jo...@gmail.com> on 2014/10/13 23:02:07 UTC, 3 replies.
- Problems building Spark for Hadoop 1.0.3 - posted by mildebrandt <ch...@woodenrhino.com> on 2014/10/13 23:39:48 UTC, 2 replies.
- SparkSQL IndexOutOfBoundsException when reading from Parquet - posted by Terry Siu <Te...@smartfocus.com> on 2014/10/13 23:57:42 UTC, 6 replies.
- Spark Cluster health check - posted by Tarun Garg <bi...@live.com> on 2014/10/14 00:01:10 UTC, 4 replies.
- Does SparkSQL work with custom defined SerDe? - posted by Chen Song <ch...@gmail.com> on 2014/10/14 00:04:01 UTC, 2 replies.
- How to construct graph in graphx - posted by Soumitra Siddharth Johri <so...@gmail.com> on 2014/10/14 00:22:44 UTC, 2 replies.
- Spark can't find jars - posted by Andy Srine <an...@gmail.com> on 2014/10/14 02:33:15 UTC, 12 replies.
- java.util.concurrent.TimeoutException: Futures timed out after [30 seconds] - posted by chepoo <sw...@163.com> on 2014/10/14 04:34:07 UTC, 3 replies.
- Can's create Kafka stream in spark shell - posted by Gary Zhao <ga...@gmail.com> on 2014/10/14 05:32:35 UTC, 5 replies.
- some more heap space error - posted by Chengi Liu <ch...@gmail.com> on 2014/10/14 05:39:20 UTC, 0 replies.
- Steps to connect BI Tools with Spark SQL using Thrift JDBC server - posted by Neeraj Garg02 <Ne...@infosys.com> on 2014/10/14 12:10:37 UTC, 0 replies.
- YARN deployment of Spark and Thrift JDBC server - posted by Neeraj Garg02 <Ne...@infosys.com> on 2014/10/14 13:31:07 UTC, 5 replies.
- Default spark.deploy.recoveryMode - posted by Priya Ch <le...@gmail.com> on 2014/10/14 13:33:46 UTC, 3 replies.
- Re: a hivectx insertinto issue - posted by valgrind_girl <12...@qq.com> on 2014/10/14 13:39:03 UTC, 1 replies.
- "Initial job has not accepted any resources" when launching SparkPi example on a worker. - posted by Theodore Si <sj...@gmail.com> on 2014/10/14 15:58:09 UTC, 3 replies.
- RE: Running Spark/YARN on AWS EMR - Issues finding file on hdfs? - posted by neeraj <ne...@infosys.com> on 2014/10/14 16:26:07 UTC, 0 replies.
- Re: Steps to connect BI Tools with Spark SQL using Thrift JDBC server - posted by Cheng Lian <li...@gmail.com> on 2014/10/14 16:34:56 UTC, 0 replies.
- TF-IDF in Spark 1.1.0 - posted by Burke Webster <bu...@gmail.com> on 2014/10/14 18:15:45 UTC, 2 replies.
- foreachPartition and task status - posted by Salman Haq <sa...@revmetrix.com> on 2014/10/14 18:34:32 UTC, 2 replies.
- spark 1.1.0/yarn hang - posted by tian zhang <tz...@yahoo.com.INVALID> on 2014/10/14 19:11:31 UTC, 1 replies.
- RDD Indexes and how to fetch all edges with a given label - posted by Soumitra Johri <so...@gmail.com> on 2014/10/14 20:46:39 UTC, 1 replies.
- Re: Spark Streaming - posted by st553 <st...@gmail.com> on 2014/10/14 20:51:36 UTC, 1 replies.
- spark sql union all is slow - posted by shuluster <sl...@turn.com> on 2014/10/14 20:51:49 UTC, 1 replies.
- SPARK_SUBMIT_CLASSPATH question - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/10/14 20:57:17 UTC, 2 replies.
- Spark Streaming: Sentiment Analysis of Twitter streams - posted by SK <sk...@gmail.com> on 2014/10/14 22:13:11 UTC, 5 replies.
- mllib CoordinateMatrix - posted by ll <du...@gmail.com> on 2014/10/14 22:18:32 UTC, 2 replies.
- Spark KMeans hangs at reduceByKey / collectAsMap - posted by Ray <ra...@outlook.com> on 2014/10/14 22:21:51 UTC, 10 replies.
- A question about streaming throughput - posted by danilopds <da...@gmail.com> on 2014/10/14 22:44:33 UTC, 2 replies.
- submitted uber-jar not seeing spark-assembly.jar at worker - posted by Tamas Sandor <ts...@gmail.com> on 2014/10/14 23:35:41 UTC, 2 replies.
- Submission to cluster fails (Spark SQL; NoSuchMethodError on SchemaRDD) - posted by Michael Campbell <mi...@gmail.com> on 2014/10/15 00:08:37 UTC, 1 replies.
- MLlib - Does LogisticRegressionModel.clearThreshold() no longer work? - posted by Aris <ar...@gmail.com> on 2014/10/15 00:14:01 UTC, 2 replies.
- rule engine based on spark - posted by salemi <al...@udo.edu> on 2014/10/15 00:21:42 UTC, 1 replies.
- Spark Streaming Empty DStream / RDD and reduceByKey - posted by Abraham Jacob <ab...@gmail.com> on 2014/10/15 01:11:52 UTC, 4 replies.
- How to I get at a SparkContext or better a JavaSparkContext from the middle of a function - posted by Steve Lewis <lo...@gmail.com> on 2014/10/15 01:47:50 UTC, 0 replies.
- Spark output to s3 extremely slow - posted by anny9699 <an...@gmail.com> on 2014/10/15 02:28:13 UTC, 2 replies.
- How to create Track per vehicle using spark RDD - posted by Manas Kar <ma...@gmail.com> on 2014/10/15 02:37:42 UTC, 3 replies.
- something about rdd.collect - posted by randylu <ra...@gmail.com> on 2014/10/15 03:25:47 UTC, 3 replies.
- kryos serializer - posted by dizzy5112 <da...@gmail.com> on 2014/10/15 03:40:09 UTC, 0 replies.
- How to write data into Hive partitioned Parquet table? - posted by Banias H <ba...@gmail.com> on 2014/10/15 03:44:23 UTC, 2 replies.
- pyspark - extract 1 field from string - posted by Chop <th...@att.net> on 2014/10/15 03:58:48 UTC, 1 replies.
- adding element into MutableList throws an error type mismatch - posted by Henry Hung <YT...@winbond.com> on 2014/10/15 08:40:59 UTC, 1 replies.
- Re: system.out.println with "--master yarn-cluster" - posted by vishnu86 <vi...@yahoo.com> on 2014/10/15 09:16:10 UTC, 0 replies.
- Unit testing jar request - posted by Jean Charles Jabouille <je...@kelkoo.com> on 2014/10/15 10:14:24 UTC, 0 replies.
- Spark on secure HDFS - posted by Erik van oosten <e....@grons.nl.INVALID> on 2014/10/15 10:30:56 UTC, 0 replies.
- Spark Concepts - posted by nsareen <ns...@gmail.com> on 2014/10/15 10:39:13 UTC, 2 replies.
- Spark dev environment best practices - posted by poiuytrez <gu...@databerries.com> on 2014/10/15 11:22:32 UTC, 0 replies.
- [SparkSQL] Convert JavaSchemaRDD to SchemaRDD - posted by Earthson <Ea...@gmail.com> on 2014/10/15 11:50:43 UTC, 2 replies.
- Re: How to make operation like cogrop() , groupbykey() on pair RDD = [ [ ], [ ] , [ ] ] - posted by Gen <ge...@gmail.com> on 2014/10/15 13:53:49 UTC, 1 replies.
- Re: jsonRDD: NoSuchMethodError - posted by Michael Campbell <mi...@gmail.com> on 2014/10/15 14:45:31 UTC, 0 replies.
- How to add HBase dependencies and conf with spark-submit? - posted by Fengyun RAO <ra...@gmail.com> on 2014/10/15 14:48:30 UTC, 4 replies.
- SparkSQL: set hive.metastore.warehouse.dir in CLI doesn't work - posted by Hao Ren <in...@gmail.com> on 2014/10/15 14:55:22 UTC, 1 replies.
- Problem executing Spark via JBoss application - posted by Mehdi Singer <me...@lampiris.be> on 2014/10/15 14:56:44 UTC, 4 replies.
- How to close resources shared in executor? - posted by Fengyun RAO <ra...@gmail.com> on 2014/10/15 15:09:37 UTC, 8 replies.
- Re: Spark Worker crashing and Master not seeing recovered worker - posted by Malte <ma...@gmail.com> on 2014/10/15 17:22:57 UTC, 0 replies.
- matrix operations? - posted by ll <du...@gmail.com> on 2014/10/15 17:46:55 UTC, 0 replies.
- RowMatrix.multiply() ? - posted by ll <du...@gmail.com> on 2014/10/15 17:50:48 UTC, 1 replies.
- Serialize/deserialize Naive Bayes model and index files - posted by jatinpreet <ja...@gmail.com> on 2014/10/15 18:57:46 UTC, 0 replies.
- spark-sql not coming up with Hive 0.10.0/CDH 4.6 - posted by Anurag Tangri <at...@groupon.com> on 2014/10/15 19:51:09 UTC, 3 replies.
- Exception while reading SendingConnection to ConnectionManagerId - posted by Jimmy Li <ji...@bluelabs.com> on 2014/10/15 21:38:41 UTC, 1 replies.
- Spark tasks still scheduled after Spark goes down - posted by pkl <lo...@gmail.com> on 2014/10/15 22:37:18 UTC, 0 replies.
- Getting the value from DStream[Int] - posted by SK <sk...@gmail.com> on 2014/10/16 01:49:11 UTC, 1 replies.
- Spark Streaming is slower than Spark - posted by Tarun Garg <bi...@live.com> on 2014/10/16 02:22:46 UTC, 0 replies.
- how to set log level of spark executor on YARN(using yarn-cluster mode) - posted by eric wong <wi...@gmail.com> on 2014/10/16 02:58:15 UTC, 2 replies.
- Spark Streaming: Invalid lambda deserialization error - posted by Chia-Chun Shih <ch...@gmail.com> on 2014/10/16 04:17:17 UTC, 0 replies.
- Play framework - posted by Mohammed Guller <mo...@glassbeam.com> on 2014/10/16 04:51:05 UTC, 12 replies.
- Sample codes for Spark streaming + Kafka + Scala + sbt? - posted by Gary Zhao <ga...@gmail.com> on 2014/10/16 05:12:17 UTC, 0 replies.
- Spark's shuffle file size keep increasing - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/10/16 05:25:19 UTC, 0 replies.
- Problems with ZooKeeper and key canceled - posted by Malte <ma...@gmail.com> on 2014/10/16 08:22:30 UTC, 0 replies.
- Re: spark1.0 principal component analysis - posted by al123 <an...@hotmail.co.uk> on 2014/10/16 11:39:52 UTC, 1 replies.
- spark-default.conf description - posted by "Kuromatsu, Nobuyuki" <n....@jp.fujitsu.com> on 2014/10/16 11:41:55 UTC, 0 replies.
- GraphX Performance - posted by Jianwei Li <ja...@gmail.com> on 2014/10/16 14:18:09 UTC, 0 replies.
- Unit testing: Mocking out Spark classes - posted by Saket Kumar <sa...@bgch.co.uk> on 2014/10/16 15:07:36 UTC, 1 replies.
- Help required on exercise Data Exploratin using Spark SQL - posted by neeraj <ne...@infosys.com> on 2014/10/16 16:45:03 UTC, 3 replies.
- Spark SQL DDL, DML commands - posted by neeraj <ne...@infosys.com> on 2014/10/16 16:50:49 UTC, 2 replies.
- TaskNotSerializableException when running through Spark shell - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/16 16:56:27 UTC, 1 replies.
- Folding an RDD in order - posted by Michael Misiewicz <mm...@gmail.com> on 2014/10/16 17:15:45 UTC, 9 replies.
- PySpark Error on Windows with sc.wholeTextFiles - posted by "Griffiths, Michael (NYC-RPM)" <Mi...@reprisemedia.com> on 2014/10/16 17:28:59 UTC, 1 replies.
- Running an action inside a loop across multiple RDDs + java.io.NotSerializableException - posted by _soumya_ <so...@gmail.com> on 2014/10/16 17:39:02 UTC, 3 replies.
- Standalone Apps and ClassNotFound - posted by Ashic Mahtab <as...@live.com> on 2014/10/16 17:40:25 UTC, 0 replies.
- ALS implicit error pyspark - posted by Gen <ge...@gmail.com> on 2014/10/16 18:53:45 UTC, 9 replies.
- Spark assembly for YARN/CDH5 - posted by Philip Ogren <ph...@oracle.com> on 2014/10/16 20:12:19 UTC, 2 replies.
- reverse an rdd - posted by ll <du...@gmail.com> on 2014/10/16 20:49:38 UTC, 3 replies.
- hi all - posted by Paweł Szulc <pa...@gmail.com> on 2014/10/16 21:58:21 UTC, 0 replies.
- How to name a DStream - posted by Soumitra Kumar <ku...@gmail.com> on 2014/10/16 22:00:00 UTC, 0 replies.
- Spark Bug? job fails to run when given options on spark-submit (but starts and fails without) - posted by Michael Campbell <mi...@gmail.com> on 2014/10/16 23:22:40 UTC, 2 replies.
- EC2 cluster set up and access to HBase in a different cluster - posted by freedafeng <fr...@yahoo.com> on 2014/10/16 23:46:42 UTC, 1 replies.
- Re: How to delete file/folder in amazon s3 using pyspark? - posted by freedafeng <fr...@yahoo.com> on 2014/10/16 23:51:42 UTC, 0 replies.
- scala: java.net.BindException? - posted by ll <du...@gmail.com> on 2014/10/16 23:51:45 UTC, 2 replies.
- Strange duplicates in data when scaling up - posted by Jacob Maloney <jm...@conversantmedia.com> on 2014/10/17 00:09:25 UTC, 1 replies.
- Spark streaming on data at rest. - posted by ameyc <am...@gmail.com> on 2014/10/17 00:10:27 UTC, 0 replies.
- Print dependency graph as DOT file - posted by Soumitra Kumar <ku...@gmail.com> on 2014/10/17 00:26:29 UTC, 0 replies.
- local class incompatible: stream classdesc serialVersionUID - posted by Pat Ferrel <pa...@occamsmachete.com> on 2014/10/17 00:35:42 UTC, 1 replies.
- Join with large data set - posted by Ankur Srivastava <an...@gmail.com> on 2014/10/17 00:57:36 UTC, 2 replies.
- Spark Hive Snappy Error - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/17 01:13:27 UTC, 11 replies.
- Exception Logging - posted by "Ge, Yao (Y.)" <yg...@ford.com> on 2014/10/17 02:11:57 UTC, 1 replies.
- object in an rdd: serializable? - posted by ll <du...@gmail.com> on 2014/10/17 02:30:17 UTC, 0 replies.
- how to build spark 1.1.0 to include org.apache.commons.math3 ? - posted by Henry Hung <YT...@winbond.com> on 2014/10/17 03:57:44 UTC, 2 replies.
- Re: object in an rdd: serializable? - posted by Boromir Widas <vc...@gmail.com> on 2014/10/17 04:41:01 UTC, 1 replies.
- error when maven build spark 1.1.0 with message "You have 1 Scalastyle violation" - posted by Henry Hung <YT...@winbond.com> on 2014/10/17 07:05:27 UTC, 2 replies.
- rdd caching and use thereof - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/10/17 08:46:21 UTC, 1 replies.
- how to submit multiple jar files when using spark-submit script in shell? - posted by eric wong <wi...@gmail.com> on 2014/10/17 08:51:20 UTC, 2 replies.
- key class requirement for PairedRDD ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/10/17 08:58:47 UTC, 2 replies.
- MLlib and pyspark features - posted by poiuytrez <gu...@databerries.com> on 2014/10/17 09:58:36 UTC, 1 replies.
- Re: MLlib linking error Mac OS X - posted by poiuytrez <gu...@databerries.com> on 2014/10/17 11:00:39 UTC, 4 replies.
- why do RDD's partitions migrate between worker nodes in different iterations - posted by randylu <ra...@gmail.com> on 2014/10/17 11:32:04 UTC, 1 replies.
- Re: How does reading the data from Amazon S3 works? - posted by ja...@centrum.cz on 2014/10/17 11:58:56 UTC, 0 replies.
- How to assure that there will be run only one map per cluster node? - posted by ja...@centrum.cz on 2014/10/17 12:06:22 UTC, 0 replies.
- Gracefully stopping a Spark Streaming application - posted by Massimiliano Tomassi <ma...@gmail.com> on 2014/10/17 12:20:12 UTC, 1 replies.
- What's wrong with my spark filter? I get "org.apache.spark.SparkException: Task not serializable" - posted by shahab <sh...@gmail.com> on 2014/10/17 12:37:23 UTC, 2 replies.
- Optimizing pairwise similarity computation or how to avoid RDD.cartesian operation ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/10/17 12:43:03 UTC, 3 replies.
- Designed behavior when master is unreachable. - posted by preeze <et...@gmail.com> on 2014/10/17 12:51:55 UTC, 2 replies.
- Regarding using spark sql with yarn - posted by twinkle sachdeva <tw...@gmail.com> on 2014/10/17 12:53:18 UTC, 0 replies.
- What is akka-actor_2.10-2.2.3-shaded-protobuf.jar? - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/10/17 14:56:33 UTC, 1 replies.
- bug with MapPartitions? - posted by davidkl <da...@hotmail.com> on 2014/10/17 16:33:34 UTC, 3 replies.
- Attaching schema to RDD created from Parquet file - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/17 18:30:04 UTC, 1 replies.
- Spark/HIVE Insert Into values Error - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/17 19:33:15 UTC, 2 replies.
- Re: how can I make the sliding window in Spark Streaming driven by data timestamp instead of absolute time - posted by st553 <st...@gmail.com> on 2014/10/17 19:51:46 UTC, 0 replies.
- complexity of each action / transformation - posted by ll <du...@gmail.com> on 2014/10/17 20:08:47 UTC, 1 replies.
- spark best practices / guidelines? - posted by ll <du...@gmail.com> on 2014/10/17 20:13:15 UTC, 0 replies.
- Unable to connect to Spark thrift JDBC server with pluggable authentication - posted by Jenny Zhao <li...@gmail.com> on 2014/10/17 22:14:22 UTC, 1 replies.
- Transforming the Dstream vs transforming each RDDs in the Dstream. - posted by Gerard Maas <ge...@gmail.com> on 2014/10/17 23:20:17 UTC, 8 replies.
- How to write a RDD into One Local Existing File? - posted by Parthus <pe...@gmail.com> on 2014/10/18 00:46:25 UTC, 3 replies.
- mllib.linalg.Vectors vs Breeze? - posted by ll <du...@gmail.com> on 2014/10/18 01:34:46 UTC, 4 replies.
- PySpark joins fail - please help - posted by Russell Jurney <ru...@gmail.com> on 2014/10/18 02:01:46 UTC, 3 replies.
- input split size - posted by Larry Liu <la...@gmail.com> on 2014/10/18 02:27:55 UTC, 6 replies.
- How to disable input split - posted by Larry Liu <la...@gmail.com> on 2014/10/18 02:35:21 UTC, 2 replies.
- How to not write empty RDD partitions in RDD.saveAsTextFile() - posted by ja...@centrum.cz on 2014/10/18 14:30:30 UTC, 1 replies.
- a hivectx insertinto issue-can inertinto function be applied to a hive table - posted by valgrind_girl <12...@qq.com> on 2014/10/18 15:50:10 UTC, 1 replies.
- Fwd: Oryx + Spark mllib - posted by Debasish Das <de...@gmail.com> on 2014/10/18 17:49:14 UTC, 2 replies.
- Spark speed performance - posted by ja...@centrum.cz on 2014/10/18 21:07:54 UTC, 3 replies.
- What executes on worker and what executes on driver side - posted by Saurabh Wadhawan <Sa...@guavus.com> on 2014/10/18 22:24:18 UTC, 4 replies.
- why fetch failed - posted by marylucy <qa...@hotmail.com> on 2014/10/19 03:22:19 UTC, 4 replies.
- Spark SQL on XML files - posted by gtinside <gt...@gmail.com> on 2014/10/19 06:08:18 UTC, 1 replies.
- Submissions open for Spark Summit East 2015 - posted by Matei Zaharia <ma...@gmail.com> on 2014/10/19 06:52:13 UTC, 1 replies.
- why does driver connects to master fail ? - posted by randylu <ra...@gmail.com> on 2014/10/19 11:14:16 UTC, 6 replies.
- scala.MatchError: class java.sql.Timestamp - posted by "Ge, Yao (Y.)" <yg...@ford.com> on 2014/10/19 16:16:58 UTC, 4 replies.
- Error while running Streaming examples - no snappyjava in java.library.path - posted by bdev <bu...@gmail.com> on 2014/10/19 18:48:56 UTC, 2 replies.
- Using SVMWithSGD model to predict - posted by npomfret <ni...@snowmonkey.co.uk> on 2014/10/19 19:53:22 UTC, 5 replies.
- Is Spark the right tool? - posted by kc66 <ka...@yahoo.com> on 2014/10/19 23:34:39 UTC, 2 replies.
- mlib model build and low CPU usage - posted by Nick Pomfret <ni...@snowmonkey.co.uk> on 2014/10/20 00:13:57 UTC, 0 replies.
- Spark Streaming scheduling control - posted by davidkl <da...@hotmail.com> on 2014/10/20 00:19:58 UTC, 3 replies.
- Re: Upgrade to Spark 1.1.0? - posted by Pat Ferrel <pa...@occamsmachete.com> on 2014/10/20 01:36:32 UTC, 1 replies.
- All executors run on just a few nodes - posted by Tao Xiao <xi...@gmail.com> on 2014/10/20 05:22:02 UTC, 3 replies.
- default parallelism bug? - posted by Kevin Jung <it...@samsung.com> on 2014/10/20 06:38:35 UTC, 3 replies.
- Re: checkpoint and not running out of disk space - posted by sivarani <wh...@gmail.com> on 2014/10/20 08:27:38 UTC, 0 replies.
- New research using Spark: Unified Secure On-/Off-line Analytics - posted by Peter Coetzee <pe...@coetzee.org> on 2014/10/20 09:59:31 UTC, 0 replies.
- What does KryoException: java.lang.NegativeArraySizeException mean? - posted by Fengyun RAO <ra...@gmail.com> on 2014/10/20 10:44:40 UTC, 3 replies.
- Re: How to aggregate data in Apach Spark - posted by Gen <ge...@gmail.com> on 2014/10/20 11:52:47 UTC, 1 replies.
- RDD to Multiple Tables SparkSQL - posted by critikaled <is...@gmail.com> on 2014/10/20 13:02:34 UTC, 2 replies.
- java.lang.OutOfMemoryError: Requested array size exceeds VM limit - posted by Arian Pasquali <ar...@arianpasquali.com> on 2014/10/20 13:12:08 UTC, 5 replies.
- How to show RDD size - posted by marylucy <qa...@hotmail.com> on 2014/10/20 15:19:39 UTC, 3 replies.
- Spark-jobserver for java apps - posted by Tomer Benyamini <to...@gmail.com> on 2014/10/20 18:08:29 UTC, 0 replies.
- Spark Streaming occasionally hangs after processing first batch - posted by t1ny <wb...@gmail.com> on 2014/10/20 18:29:36 UTC, 0 replies.
- How to emit multiple keys for the same value? - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2014/10/20 18:31:29 UTC, 2 replies.
- SparkSQL - TreeNodeException for unresolved attributes - posted by Terry Siu <Te...@smartfocus.com> on 2014/10/20 19:33:11 UTC, 3 replies.
- java.lang.OutOfMemoryError: Java heap space during reduce operation - posted by ayandas84 <ay...@gmail.com> on 2014/10/20 19:47:56 UTC, 0 replies.
- CustomReceiver : ActorOf vs ActorSelection - posted by vvarma <vi...@gmail.com> on 2014/10/20 20:34:53 UTC, 0 replies.
- spark sql not able to find classes with --jars option - posted by sadhan <sa...@gmail.com> on 2014/10/20 20:58:15 UTC, 1 replies.
- Re: RDD Cleanup - posted by maihung <hu...@gmail.com> on 2014/10/20 21:44:35 UTC, 0 replies.
- Saving very large data sets as Parquet on S3 - posted by Daniel Mahler <dm...@gmail.com> on 2014/10/20 21:45:14 UTC, 2 replies.
- Getting spark to use more than 4 cores on Amazon EC2 - posted by Daniel Mahler <dm...@gmail.com> on 2014/10/20 21:53:45 UTC, 10 replies.
- example.jar caused exception when running pi.py, spark 1.1 - posted by freedafeng <fr...@yahoo.com> on 2014/10/20 22:43:57 UTC, 1 replies.
- saveasSequenceFile with codec and compression type - posted by gpatcham <gp...@gmail.com> on 2014/10/20 23:41:29 UTC, 1 replies.
- How do you write a JavaRDD into a single file - posted by Steve Lewis <lo...@gmail.com> on 2014/10/21 00:13:39 UTC, 5 replies.
- worker_instances vs worker_cores - posted by anny9699 <an...@gmail.com> on 2014/10/21 00:21:18 UTC, 2 replies.
- spark sql: timestamp in json - fails - posted by tridib <tr...@live.com> on 2014/10/21 01:34:11 UTC, 11 replies.
- add external jars to spark-shell - posted by Chuang Liu <li...@gmail.com> on 2014/10/21 02:28:23 UTC, 1 replies.
- Help with an error - posted by Sunandan Chakraborty <sc...@nyu.edu> on 2014/10/21 04:08:20 UTC, 0 replies.
- Spark SQL : sqlContext.jsonFile date type detection and perforormance - posted by tridib <tr...@live.com> on 2014/10/21 04:56:45 UTC, 1 replies.
- Convert Iterable to RDD - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/10/21 04:58:19 UTC, 2 replies.
- spark sql: join sql fails after sqlCtx.cacheTable() - posted by tridib <tr...@live.com> on 2014/10/21 05:45:56 UTC, 6 replies.
- Does start-slave.sh use the values in conf/slaves to launch a worker in Spark standalone cluster mode - posted by Soumya Simanta <so...@gmail.com> on 2014/10/21 06:55:12 UTC, 1 replies.
- Spark MLLIB Decision Tree - ArrayIndexOutOfBounds Exception - posted by lokeshkumar <lo...@dataken.net> on 2014/10/21 11:42:52 UTC, 4 replies.
- [SQL] Is RANK function supposed to work in SparkSQL 1.1.0? - posted by Pierre B <pi...@realimpactanalytics.com> on 2014/10/21 12:00:39 UTC, 2 replies.
- Getting Spark SQL talking to Sql Server - posted by Ashic Mahtab <as...@live.com> on 2014/10/21 12:26:42 UTC, 2 replies.
- Custom s3 endpoint - posted by bobrik <ib...@gmail.com> on 2014/10/21 12:31:38 UTC, 0 replies.
- create a Row Matrix - posted by viola <vi...@siemens.com> on 2014/10/21 14:34:48 UTC, 3 replies.
- Spark Cassandra connector issue - posted by Ankur Srivastava <an...@gmail.com> on 2014/10/21 19:27:07 UTC, 1 replies.
- disk-backing pyspark rdds? - posted by Eric Jonas <jo...@eecs.berkeley.edu> on 2014/10/21 19:45:10 UTC, 0 replies.
- stage failure: Task 0 in stage 0.0 failed 4 times - posted by freedafeng <fr...@yahoo.com> on 2014/10/21 20:04:02 UTC, 1 replies.
- How to set hadoop native library path in spark-1.1 - posted by Pradeep Ch <pr...@gmail.com> on 2014/10/21 20:44:16 UTC, 1 replies.
- Class not found - posted by Pat Ferrel <pa...@occamsmachete.com> on 2014/10/21 21:16:01 UTC, 2 replies.
- How to calculate percentiles with Spark? - posted by sparkuser <us...@gmail.com> on 2014/10/21 21:22:40 UTC, 1 replies.
- Spark-Submit Python along with JAR - posted by TJ Klein <TJ...@gmail.com> on 2014/10/21 21:57:02 UTC, 0 replies.
- spark sql: sqlContext.jsonFile date type detection and perforormance - posted by tridib <tr...@live.com> on 2014/10/21 22:00:20 UTC, 2 replies.
- Primitive arrays in Spark - posted by Akshat Aranya <aa...@gmail.com> on 2014/10/21 22:08:36 UTC, 1 replies.
- MLLib libsvm format - posted by Sameer Tilak <ss...@live.com> on 2014/10/21 22:10:14 UTC, 2 replies.
- Usage of spark-ec2: how to deploy a revised version of spark 1.1.0? - posted by freedafeng <fr...@yahoo.com> on 2014/10/21 22:25:34 UTC, 4 replies.
- com.esotericsoftware.kryo.KryoException: Buffer overflow. - posted by nitinkak001 <ni...@gmail.com> on 2014/10/21 23:30:27 UTC, 1 replies.
- SchemaRDD.where clause error - posted by Kevin Paul <ke...@gmail.com> on 2014/10/21 23:40:24 UTC, 1 replies.
- buffer overflow when running Kmeans - posted by Yang <te...@gmail.com> on 2014/10/21 23:44:43 UTC, 1 replies.
- Spark - HiveContext - Unstructured Json - posted by Harivardan Jayaraman <hj...@kabaminc.com> on 2014/10/21 23:56:52 UTC, 2 replies.
- How to read BZ2 XML file in Spark? - posted by John Roberts <in...@gmail.com> on 2014/10/22 00:02:57 UTC, 1 replies.
- spark ui redirecting to port 8100 - posted by sadhan <sa...@gmail.com> on 2014/10/22 00:29:27 UTC, 1 replies.
- Spark Streaming - How to write RDD's in same directory ? - posted by Shailesh Birari <sb...@wynyardgroup.com> on 2014/10/22 00:51:42 UTC, 2 replies.
- Using the DataStax Cassandra Connector from PySpark - posted by Mike Sukmanowsky <mi...@gmail.com> on 2014/10/22 01:02:48 UTC, 0 replies.
- Spark Streaming Applications - posted by Saiph Kappa <sa...@gmail.com> on 2014/10/22 01:33:02 UTC, 5 replies.
- spark 1.1.0 RDD and Calliope 1.1.0-CTP-U2-H2 - posted by Tian Zhang <tz...@yahoo.com> on 2014/10/22 01:33:31 UTC, 0 replies.
- Re: spark-ec2 script with VPC - posted by Mike Jennings <mv...@gmail.com> on 2014/10/22 02:09:31 UTC, 0 replies.
- Num-executors and executor-cores overwritten by defaults - posted by Ilya Ganelin <il...@gmail.com> on 2014/10/22 06:35:21 UTC, 0 replies.
- spark sql query optimization , and decision tree building - posted by sanath kumar <sa...@gmail.com> on 2014/10/22 06:58:06 UTC, 3 replies.
- Subscription request - posted by Sathya <sa...@gmail.com> on 2014/10/22 08:23:45 UTC, 0 replies.
- Re: rdd.checkpoint() : driver-side lineage?? - posted by harsh2005_7 <ha...@yahoo.com> on 2014/10/22 09:20:23 UTC, 0 replies.
- SchemaRDD Convert - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/10/22 10:16:47 UTC, 2 replies.
- Python vs Scala performance - posted by Marius Soutier <mp...@gmail.com> on 2014/10/22 12:00:41 UTC, 13 replies.
- confused about memory usage in spark - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/10/22 16:21:28 UTC, 2 replies.
- Re: - posted by Ted Yu <yu...@gmail.com> on 2014/10/22 16:45:36 UTC, 0 replies.
- Spark as key/value store? - posted by Hajime Takase <pl...@gmail.com> on 2014/10/22 16:51:27 UTC, 2 replies.
- Rdd of Rdds - posted by Tomer Benyamini <to...@gmail.com> on 2014/10/22 16:58:41 UTC, 3 replies.
- Sharing spark context across multiple spark sql cli initializations - posted by Sadhan Sood <sa...@gmail.com> on 2014/10/22 20:10:48 UTC, 2 replies.
- Multitenancy in Spark - within/across spark context - posted by Ashwin Shankar <as...@gmail.com> on 2014/10/22 20:47:21 UTC, 7 replies.
- Shuffle issues in the current master - posted by DB Tsai <db...@dbtsai.com> on 2014/10/22 20:54:01 UTC, 6 replies.
- streaming join sliding windows - posted by Josh J <jo...@gmail.com> on 2014/10/22 21:29:39 UTC, 1 replies.
- Setting only master heap - posted by Keith Simmons <ke...@pulse.io> on 2014/10/22 22:46:17 UTC, 4 replies.
- ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/22 23:14:30 UTC, 2 replies.
- Does SQLSpark support Hive built in functions? - posted by shahab <sh...@gmail.com> on 2014/10/22 23:20:04 UTC, 1 replies.
- Solving linear equations - posted by Martin Enzinger <ma...@gmail.com> on 2014/10/23 01:15:24 UTC, 3 replies.
- Spark: Order by Failed, java.lang.NullPointerException - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/23 02:29:39 UTC, 3 replies.
- version mismatch issue with spark breeze vector - posted by Yang <te...@gmail.com> on 2014/10/23 02:39:57 UTC, 1 replies.
- SparkSQL , best way to divide data into partitions? - posted by raymond <rg...@gmail.com> on 2014/10/23 04:03:05 UTC, 1 replies.
- hive timestamp column always returns null - posted by tridib <tr...@live.com> on 2014/10/23 05:53:54 UTC, 1 replies.
- Workaround for SPARK-1931 not compiling - posted by Arpit Kumar <ar...@gmail.com> on 2014/10/23 06:18:55 UTC, 2 replies.
- scalac crash when compiling DataTypeConversions.scala - posted by Ryan Williams <ry...@gmail.com> on 2014/10/23 07:00:43 UTC, 8 replies.
- About "Memory usage" in the Spark UI - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/10/23 07:07:42 UTC, 4 replies.
- how to run a dev spark project without fully rebuilding the fat jar ? - posted by Yang <te...@gmail.com> on 2014/10/23 07:29:19 UTC, 2 replies.
- Is Spark streaming suitable for our architecture? - posted by Albert Vila <al...@augure.com> on 2014/10/23 09:07:47 UTC, 3 replies.
- Dynamically loaded Spark-stream consumer - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/23 09:36:08 UTC, 0 replies.
- How to access objects declared and initialized outside the call() method of JavaRDD - posted by Localhost shell <un...@gmail.com> on 2014/10/23 09:46:06 UTC, 7 replies.
- Which is better? One spark app listening to 10 topics vs. 10 spark apps each listening to 1 topic - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/23 09:49:47 UTC, 3 replies.
- NoClassDefFoundError on ThreadFactoryBuilder in Intellij - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/23 10:43:27 UTC, 3 replies.
- SparkSQL and columnar data - posted by Marius Soutier <mp...@gmail.com> on 2014/10/23 11:47:00 UTC, 0 replies.
- Spark Cassandra Connector proper usage - posted by Ashic Mahtab <as...@live.com> on 2014/10/23 13:21:03 UTC, 3 replies.
- what's the best way to initialize an executor? - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/10/23 14:41:47 UTC, 1 replies.
- Aggregation Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/23 14:54:38 UTC, 2 replies.
- unable to make a custom class as a key in a pairrdd - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/10/23 21:52:12 UTC, 4 replies.
- JavaHiveContext class not found error. Help!! - posted by nitinkak001 <ni...@gmail.com> on 2014/10/23 23:53:35 UTC, 1 replies.
- spark is running extremely slow with larger data set, like 2G - posted by xuhongnever <xu...@gmail.com> on 2014/10/24 00:12:35 UTC, 5 replies.
- Spark 1.1.0 and Hive 0.12.0 Compatibility Issue - posted by Ar...@gmail.com, ar...@gmail.com on 2014/10/24 00:17:43 UTC, 3 replies.
- Exceptions not caught? - posted by ankits <an...@gmail.com> on 2014/10/24 00:40:43 UTC, 7 replies.
- Re: spark-ec2: deploy customized spark version - posted by freedafeng <fr...@yahoo.com> on 2014/10/24 01:12:51 UTC, 0 replies.
- Spark using HDFS data [newb] - posted by matan <de...@gmail.com> on 2014/10/24 03:12:54 UTC, 1 replies.
- Problem packing spark-assembly jar - posted by Yana Kadiyska <ya...@gmail.com> on 2014/10/24 03:22:40 UTC, 2 replies.
- Re: Local Dev Env with Mesos + Spark Streaming on Docker: Can't submit jobs. - posted by Svend <sv...@gmail.com> on 2014/10/24 03:34:19 UTC, 0 replies.
- Re: Spark 1.0.0 on yarn cluster problem - posted by firemonk9 <dh...@gmail.com> on 2014/10/24 04:25:46 UTC, 2 replies.
- large benchmark sets for MLlib experiments - posted by Chih-Jen Lin <cj...@csie.ntu.edu.tw> on 2014/10/24 04:33:50 UTC, 0 replies.
- Memory requirement of using Spark - posted by "jian.t" <ji...@gmail.com> on 2014/10/24 05:17:12 UTC, 2 replies.
- submit query to spark cluster using spark-sql - posted by tridib <tr...@live.com> on 2014/10/24 07:30:35 UTC, 1 replies.
- How can I set the IP a worker use? - posted by Theodore Si <sj...@gmail.com> on 2014/10/24 09:04:06 UTC, 2 replies.
- Is SparkSQL + JDBC server a good approach for caching? - posted by ankits <an...@gmail.com> on 2014/10/24 10:05:46 UTC, 7 replies.
- Broadcast failure with variable size of ~ 500mb with "key already cancelled ?" - posted by htailor <he...@live.co.uk> on 2014/10/24 13:31:59 UTC, 0 replies.
- Measuring execution time - posted by shahab <sh...@gmail.com> on 2014/10/24 14:51:31 UTC, 1 replies.
- scala.collection.mutable.ArrayOps$ofRef$.length$extension since Spark 1.1.0 - posted by Marius Soutier <mp...@gmail.com> on 2014/10/24 15:35:54 UTC, 2 replies.
- Spark doesn't retry task while writing to HDFS - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/10/24 15:45:52 UTC, 0 replies.
- spark-submit memory too larger - posted by marylucy <qa...@hotmail.com> on 2014/10/24 17:07:55 UTC, 0 replies.
- PySpark problem with textblob from NLTK used in map - posted by ja...@centrum.cz on 2014/10/24 17:24:07 UTC, 2 replies.
- Job cancelled because SparkContext was shut down - failures! - posted by Sadhan Sood <sa...@gmail.com> on 2014/10/24 18:15:59 UTC, 1 replies.
- Re: spark-submit memory too larger - posted by Sameer Farooqui <sa...@databricks.com> on 2014/10/24 19:27:04 UTC, 1 replies.
- [Spark SQL] Setting variables - posted by Yana Kadiyska <ya...@gmail.com> on 2014/10/24 20:32:56 UTC, 1 replies.
- Under which user is the program run on slaves? - posted by ja...@centrum.cz on 2014/10/24 21:08:09 UTC, 1 replies.
- Re: How to use FlumeInputDStream in spark cluster? - posted by BigDataUser <su...@yahoo.com> on 2014/10/24 21:41:00 UTC, 0 replies.
- Function returning multiple Values - problem with using "if-else" - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2014/10/24 21:52:15 UTC, 2 replies.
- Re: Spark LIBLINEAR - posted by "k.tham" <ke...@gmail.com> on 2014/10/24 23:09:40 UTC, 9 replies.
- Re: Spark using non-HDFS data on a distributed file system cluster - posted by matan <de...@gmail.com> on 2014/10/24 23:35:47 UTC, 0 replies.
- docker spark 1.1.0 cluster - posted by Josh J <jo...@gmail.com> on 2014/10/24 23:38:03 UTC, 2 replies.
- How to set persistence level of graph in GraphX in spark 1.0.0 - posted by Arpit Kumar <ar...@gmail.com> on 2014/10/25 05:26:34 UTC, 5 replies.
- Spark Worker node accessing Hive metastore - posted by ken <ke...@verizon.com> on 2014/10/25 07:32:00 UTC, 2 replies.
- Read a TextFile(1 record contains 4 lines) into a RDD - posted by Parthus <pe...@gmail.com> on 2014/10/25 09:57:12 UTC, 1 replies.
- NullPointerException when using Accumulators on cluster - posted by "octavian.ganea" <oc...@inf.ethz.ch> on 2014/10/25 20:38:21 UTC, 0 replies.
- Accumulators : Task not serializable: java.io.NotSerializableException: org.apache.spark.SparkContext - posted by "octavian.ganea" <oc...@inf.ethz.ch> on 2014/10/25 22:14:27 UTC, 3 replies.
- Bug in Accumulators... - posted by "octavian.ganea" <oc...@inf.ethz.ch> on 2014/10/25 22:41:03 UTC, 3 replies.
- Asymmetric spark cluster memory utilization - posted by Manas Kar <ma...@gmail.com> on 2014/10/26 02:21:34 UTC, 0 replies.
- Spark as Relational Database - posted by Peter Wolf <op...@gmail.com> on 2014/10/26 04:18:10 UTC, 13 replies.
- HiveSQL percentile is query slow - posted by Kevin Paul <ke...@gmail.com> on 2014/10/26 04:48:54 UTC, 0 replies.
- Create table error from Hive in spark-assembly-1.0.2.jar - posted by Jacob Chacko - Catalyst Consulting <ja...@catalystconsulting.be> on 2014/10/26 08:36:33 UTC, 1 replies.
- Implement Count by Minute in Spark Streaming - posted by Ji ZHANG <zh...@gmail.com> on 2014/10/26 12:03:51 UTC, 3 replies.
- what classes are needed to register in KryoRegistrator, e.g. Row? - posted by Fengyun RAO <ra...@gmail.com> on 2014/10/26 16:43:58 UTC, 1 replies.
- How do you use the thrift-server to get data from a Spark program? - posted by Edward Sargisson <ej...@gmail.com> on 2014/10/26 20:16:02 UTC, 1 replies.
- Job Offering - Spark/Scala - Orange County, CA - posted by Marc Bir <mb...@compellon.com> on 2014/10/26 22:13:21 UTC, 0 replies.
- Spark optimization - posted by Morbious <kn...@gmail.com> on 2014/10/26 22:44:50 UTC, 1 replies.
- Spark SQL configuration - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/10/27 00:59:39 UTC, 3 replies.
- Spark 1.1.0 ClassNotFoundException issue when submit with multi jars using CLUSTER MODE - posted by xing_bing <bi...@cekasp.cn> on 2014/10/27 03:17:14 UTC, 0 replies.
- Spark SQL Exists Clause - posted by agg212 <al...@brown.edu> on 2014/10/27 05:01:42 UTC, 1 replies.
- Re: RDD to DStream - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/27 06:42:02 UTC, 9 replies.
- Spark streaming update/restart gracefully - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/27 06:56:56 UTC, 0 replies.
- Python code crashing on ReduceByKey if I return custom class object - posted by sid <si...@gmail.com> on 2014/10/27 07:57:01 UTC, 3 replies.
- Ephemeral Hive metastore for HiveContext? - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/27 09:38:47 UTC, 4 replies.
- Re: SparkSQL display wrong result - posted by Cheng Lian <li...@gmail.com> on 2014/10/27 09:48:14 UTC, 0 replies.
- Sort-based shuffle did not work as expected - posted by "sunfl@certusnet.com.cn" <su...@certusnet.com.cn> on 2014/10/27 10:41:13 UTC, 1 replies.
- Re: Reply: SparkSQL display wrong result - posted by Cheng Lian <li...@gmail.com> on 2014/10/27 11:00:05 UTC, 0 replies.
- NoSuchMethodError: cassandra.thrift.ITransportFactory.openTransport() - posted by Sasi <sa...@gmail.com> on 2014/10/27 11:00:58 UTC, 4 replies.
- Fwd: [akka-user] Akka Camel plus Spark Streaming - posted by Patrick McGloin <mc...@gmail.com> on 2014/10/27 11:33:16 UTC, 1 replies.
- Missing java.util.Date class error while running Spark job - posted by Saket Kumar <sa...@bgch.co.uk> on 2014/10/27 13:10:39 UTC, 1 replies.
- OutOfMemory in "cogroup" - posted by Shixiong Zhu <zs...@gmail.com> on 2014/10/27 13:52:00 UTC, 2 replies.
- How to avoid use snappy compression when saveAsSequenceFile? - posted by buring <qy...@gmail.com> on 2014/10/27 14:13:06 UTC, 5 replies.
- Unsubscribe - posted by Ian Ferreira <ia...@hotmail.com> on 2014/10/27 14:50:00 UTC, 2 replies.
- How to set JAVA_HOME with --deploy-mode cluster - posted by Thomas Risberg <tr...@pivotal.io> on 2014/10/27 15:26:34 UTC, 1 replies.
- Is Spark 1.1.0 incompatible with Hive? - posted by nitinkak001 <ni...@gmail.com> on 2014/10/27 16:36:02 UTC, 5 replies.
- Problem to run spark as standalone - posted by java8964 <ja...@hotmail.com> on 2014/10/27 16:38:32 UTC, 1 replies.
- What this exception means? ConnectionManager: key already cancelled ? - posted by shahab <sh...@gmail.com> on 2014/10/27 16:43:56 UTC, 1 replies.
- deploying a model built in mllib - posted by chirag lakhani <ch...@gmail.com> on 2014/10/27 16:56:29 UTC, 2 replies.
- Measuring Performance in Spark - posted by mahsa <ma...@gmail.com> on 2014/10/27 18:22:48 UTC, 5 replies.
- Java api overhead? - posted by Sonal Goyal <so...@gmail.com> on 2014/10/27 19:25:26 UTC, 3 replies.
- MLLib ALS ArrayIndexOutOfBoundsException with Scala Spark 1.1.0 - posted by Ilya Ganelin <il...@gmail.com> on 2014/10/27 19:36:46 UTC, 6 replies.
- using existing hive with spark sql - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/10/27 20:12:08 UTC, 1 replies.
- Spark Shell strange worker Exception - posted by Paolo Platter <pa...@agilelab.it> on 2014/10/27 20:39:49 UTC, 1 replies.
- exact count using rdd.count()? - posted by Josh J <jo...@gmail.com> on 2014/10/27 21:29:34 UTC, 1 replies.
- Upcoming Scala DevFlow Training By the Bay, November 6-7 - posted by Alexy Khrabrov <al...@scalable.pro> on 2014/10/27 21:42:13 UTC, 0 replies.
- Spark to eliminate full-table scan latency - posted by Ron Ayoub <ro...@live.com> on 2014/10/27 21:47:09 UTC, 3 replies.
- Scala Spark IDE help - posted by Eric Tanner <er...@justenough.com> on 2014/10/27 22:03:15 UTC, 3 replies.
- Re: Submitting Spark job on cluster from dev environment - posted by Shailesh Birari <sb...@wynyardgroup.com> on 2014/10/27 22:07:41 UTC, 1 replies.
- Subquery in having-clause (Spark 1.1.0) - posted by Daniel Klinger <dk...@web-computing.de> on 2014/10/27 22:47:36 UTC, 3 replies.
- [SPARK SQL] kerberos error when creating database from beeline/ThriftServer2 - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/10/27 22:48:45 UTC, 4 replies.
- Cant start spark-shell in CDH Spark Standalone 1.1.0+cdh5.2.0+56 - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2014/10/27 22:58:01 UTC, 1 replies.
- Does JavaSchemaRDD inherit the Hive partitioning of data? - posted by nitinkak001 <ni...@gmail.com> on 2014/10/28 00:34:26 UTC, 4 replies.
- Batch of updates - posted by Flavio Pompermaier <po...@okkam.it> on 2014/10/28 00:45:11 UTC, 4 replies.
- Meaning of persistence levels -- setting persistence causing out of memory errors with pyspark - posted by Eric Jonas <jo...@eecs.berkeley.edu> on 2014/10/28 00:47:19 UTC, 0 replies.
- combine rdds? - posted by Josh J <jo...@gmail.com> on 2014/10/28 00:50:05 UTC, 1 replies.
- Fixed Sized Strings in Spark SQL - posted by agg212 <al...@brown.edu> on 2014/10/28 01:08:55 UTC, 0 replies.
- Filtering URLs from incoming Internet traffic(Stream data). feasible with spark streaming? - posted by Nasir Khan <na...@gmail.com> on 2014/10/28 05:15:55 UTC, 3 replies.
- Spark Streaming into Cassandra - NoClass ColumnMapper - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/28 05:22:08 UTC, 1 replies.
- Support Hive 0.13.1 in Spark SQL - posted by "Cheng, Hao" <ha...@intel.com> on 2014/10/28 05:54:45 UTC, 1 replies.
- Why RDD is not cached? - posted by shahab <sh...@gmail.com> on 2014/10/28 08:18:40 UTC, 4 replies.
- sampling in spark - posted by Chengi Liu <ch...@gmail.com> on 2014/10/28 08:26:32 UTC, 4 replies.
- Singapore Meetup - posted by Social Marketing <we...@gmail.com> on 2014/10/28 09:03:00 UTC, 0 replies.
- Submitting Spark application through code - posted by sivarani <wh...@gmail.com> on 2014/10/28 09:20:24 UTC, 7 replies.
- Re: Spark Streaming - How to remove state for key - posted by sivarani <wh...@gmail.com> on 2014/10/28 09:26:39 UTC, 0 replies.
- Spark SQL reduce number of java threads - posted by Wanda Hawk <wa...@yahoo.com.INVALID> on 2014/10/28 09:38:01 UTC, 2 replies.
- How to import mllib.rdd.RDDFunctions into the spark-shell - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/28 10:09:06 UTC, 4 replies.
- How to set Spark to perform only one map at once at each cluster node - posted by ja...@centrum.cz on 2014/10/28 10:27:10 UTC, 0 replies.
- SparkSql OutOfMemoryError - posted by Zhanfeng Huo <hu...@gmail.com> on 2014/10/28 10:33:48 UTC, 0 replies.
- Is There Any Benchmarks Comparing Spark SQL and Hive. - posted by Mars Max <ma...@baidu.com> on 2014/10/28 10:35:21 UTC, 2 replies.
- How many executor process does an application receives? - posted by shahab <sh...@gmail.com> on 2014/10/28 10:56:10 UTC, 1 replies.
- Spray client reports Exception: akka.actor.ActorSystem.dispatcher()Lscala/concurrent/ExecutionContext - posted by Jianshi Huang <ji...@gmail.com> on 2014/10/28 11:02:48 UTC, 7 replies.
- newbie question quickstart example sbt issue - posted by nl19856 <Ha...@gmail.com> on 2014/10/28 11:27:00 UTC, 3 replies.
- Re: How to set Spark to perform only one map at once at each cluster node - posted by Yanbo Liang <ya...@gmail.com> on 2014/10/28 11:49:07 UTC, 3 replies.
- Re: SparkSql OutOfMemoryError - posted by Yanbo Liang <ya...@gmail.com> on 2014/10/28 11:50:12 UTC, 1 replies.
- sbt error building spark : [FATAL] Non-resolvable parent POM: - posted by nl19856 <Ha...@gmail.com> on 2014/10/28 12:14:16 UTC, 0 replies.
- Suitability for spark for master worker distributed patterns... - posted by Sasha Kacanski <sk...@gmail.com> on 2014/10/28 12:44:46 UTC, 0 replies.
- How can number of partitions be set in "spark-env.sh"? - posted by shahab <sh...@gmail.com> on 2014/10/28 14:20:14 UTC, 3 replies.
- GraphX StackOverflowError - posted by Zuhair Khayyat <zu...@gmail.com> on 2014/10/28 14:27:20 UTC, 1 replies.
- Deploying Spark on Stand alone cluster - posted by TravisJ <j....@gmail.com> on 2014/10/28 14:38:20 UTC, 0 replies.
- Streaming window operations not producing output - posted by diogo <di...@uken.com> on 2014/10/28 15:24:02 UTC, 0 replies.
- Ending a job early - posted by Jim Carroll <ji...@gmail.com> on 2014/10/28 15:27:23 UTC, 1 replies.
- pySpark - convert log/txt files into sequenceFile - posted by Csaba Ragany <ra...@gmail.com> on 2014/10/28 15:47:30 UTC, 3 replies.
- install sbt - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/10/28 15:54:02 UTC, 4 replies.
- java.lang.IllegalArgumentException: requirement failed: sizeInBytes was negative: -9223372036842471144 - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/10/28 16:04:15 UTC, 0 replies.
- Selecting Based on Nested Values using Language Integrated Query Syntax - posted by Brett Antonides <ba...@gmail.com> on 2014/10/28 16:30:59 UTC, 10 replies.
- Saving to Cassandra from Spark Streaming - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/28 17:34:56 UTC, 2 replies.
- JdbcRDD in Java - posted by Ron Ayoub <ro...@live.com> on 2014/10/28 17:37:21 UTC, 1 replies.
- Re: Keep state inside map function - posted by Koert Kuipers <ko...@tresata.com> on 2014/10/28 18:10:05 UTC, 0 replies.
- real-time streaming - posted by ll <du...@gmail.com> on 2014/10/28 18:44:38 UTC, 2 replies.
- Re: Spark Streaming and Storm - posted by critikaled <is...@gmail.com> on 2014/10/28 19:03:44 UTC, 0 replies.
- Including jars in Spark-shell vs Spark-submit - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/28 19:08:08 UTC, 3 replies.
- Is Spark in Java a bad idea? - posted by Ron Ayoub <ro...@live.com> on 2014/10/28 19:27:34 UTC, 5 replies.
- Yarn-Client Python - posted by TJ Klein <TJ...@gmail.com> on 2014/10/28 20:50:02 UTC, 2 replies.
- Re: Submitting Spark job on Unix cluster from dev environment (Windows) - posted by Shailesh Birari <sb...@wynyardgroup.com> on 2014/10/28 21:09:39 UTC, 2 replies.
- problem with start-slaves.sh - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/10/28 21:32:06 UTC, 4 replies.
- Spark-submit job "Killed" - posted by akhandeshi <am...@gmail.com> on 2014/10/28 22:32:25 UTC, 2 replies.
- Building Spark against Cloudera 5.2.0 - Failure - posted by "Ganelin, Ilya" <Il...@capitalone.com> on 2014/10/28 22:35:16 UTC, 1 replies.
- unsubscribe - posted by Ricky Thomas <ri...@truedash.io> on 2014/10/28 23:02:24 UTC, 3 replies.
- Is it possible to call a transform + action inside an action? - posted by kpeng1 <kp...@gmail.com> on 2014/10/28 23:33:17 UTC, 3 replies.
- How to properly debug spark streaming? - posted by kpeng1 <kp...@gmail.com> on 2014/10/28 23:59:05 UTC, 0 replies.
- run multiple spark applications in parallel - posted by Josh J <jo...@gmail.com> on 2014/10/29 00:05:47 UTC, 4 replies.
- how to retrieve the value of a column of type date/timestamp from a Spark SQL Row - posted by Mohammed Guller <mo...@glassbeam.com> on 2014/10/29 01:03:36 UTC, 3 replies.
- com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: 13994 - posted by Steve Lewis <lo...@gmail.com> on 2014/10/29 03:43:19 UTC, 2 replies.
- Use RDD like a Iterator - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/10/29 04:15:36 UTC, 5 replies.
- Re: cannot run spark shell in yarn-client mode - posted by TJ Klein <TJ...@gmail.com> on 2014/10/29 06:24:14 UTC, 0 replies.
- FileNotFoundException in appcache shuffle files - posted by Ryan Williams <ry...@gmail.com> on 2014/10/29 06:31:09 UTC, 3 replies.
- Spark Streaming from Kafka - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/29 07:20:54 UTC, 3 replies.
- How to retrieve spark context when hiveContext is used in sparkstreaming - posted by critikaled <is...@gmail.com> on 2014/10/29 07:26:25 UTC, 0 replies.
- Spark streaming and save to cassandra and elastic search - posted by aarthi <aa...@gmail.com> on 2014/10/29 09:33:17 UTC, 1 replies.
- sbt/sbt compile error [FATAL] - posted by HansPeterS <Ha...@gmail.com> on 2014/10/29 11:55:51 UTC, 0 replies.
- Re: sbt/sbt compile error [FATAL] - posted by Soumya Simanta <so...@gmail.com> on 2014/10/29 12:39:43 UTC, 1 replies.
- Spark 1.1.0 on Hive 0.13.1 - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/10/29 12:43:45 UTC, 3 replies.
- Streaming Question regarding lazy calculations - posted by sivarani <wh...@gmail.com> on 2014/10/29 13:55:09 UTC, 0 replies.
- "CANNOT FIND ADDRESS" - posted by akhandeshi <am...@gmail.com> on 2014/10/29 14:35:28 UTC, 4 replies.
- Spark Performance - posted by akhandeshi <am...@gmail.com> on 2014/10/29 14:55:07 UTC, 0 replies.
- XML Utilities for Apache Spark - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/10/29 16:49:42 UTC, 0 replies.
- PySpark and Cassandra 2.1 Examples - posted by Mike Sukmanowsky <mi...@gmail.com> on 2014/10/29 17:01:13 UTC, 1 replies.
- Spark Streaming with Kinesis - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/29 17:22:58 UTC, 2 replies.
- Re: Unit Testing (JUnit) with Spark - posted by touchdown <yu...@gmail.com> on 2014/10/29 17:26:14 UTC, 0 replies.
- Spark SQL and confused about number of partitions/tasks to do a simple join. - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/10/29 18:55:04 UTC, 2 replies.
- winutils - posted by Ron Ayoub <ro...@live.com> on 2014/10/29 19:31:56 UTC, 4 replies.
- Questions about serialization and SparkConf - posted by Steve Lewis <lo...@gmail.com> on 2014/10/29 19:57:09 UTC, 1 replies.
- Spark SQL - how to query dates stored as millis? - posted by bkarels <bk...@gmail.com> on 2014/10/29 20:49:58 UTC, 0 replies.
- difference between --jars and --driver-class-path - posted by freedafeng <fr...@yahoo.com> on 2014/10/29 23:09:06 UTC, 0 replies.
- Convert DStream to String - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/29 23:15:06 UTC, 3 replies.
- what does DStream.union() do? - posted by spr <sp...@yarcdata.com> on 2014/10/29 23:15:41 UTC, 3 replies.
- BUG: when running as "extends App", closures don't capture variables - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2014/10/29 23:16:36 UTC, 2 replies.
- how to extract/combine elements of an Array in DStream element? - posted by spr <sp...@yarcdata.com> on 2014/10/29 23:29:59 UTC, 1 replies.
- Spark related meet up on Nov 6th in SF - posted by Alexis Roos <al...@gmail.com> on 2014/10/29 23:38:04 UTC, 0 replies.
- Spark with HLists - posted by Simon Hafner <re...@gmail.com> on 2014/10/30 00:05:39 UTC, 1 replies.
- How does custom partitioning in PySpark work? - posted by Def_Os <nj...@gmail.com> on 2014/10/30 00:20:59 UTC, 0 replies.
- use additional ebs volumes for hdfs storage with spark-ec2 - posted by Daniel Mahler <dm...@gmail.com> on 2014/10/30 00:51:49 UTC, 2 replies.
- SparkSQL: Nested Query error - posted by SK <sk...@gmail.com> on 2014/10/30 01:45:24 UTC, 2 replies.
- Re: How to incorporate the new data in the MLlib-NaiveBayes model along with predicting? - posted by Chris Fregly <ch...@fregly.com> on 2014/10/30 01:46:25 UTC, 0 replies.
- Task Size Increases when using loops - posted by nsareen <ns...@gmail.com> on 2014/10/30 02:47:58 UTC, 0 replies.
- GC Issues with randomSplit on large dataset - posted by "Ganelin, Ilya" <Il...@capitalone.com> on 2014/10/30 03:29:56 UTC, 3 replies.
- spark-submit results in NoClassDefFoundError - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/10/30 03:50:05 UTC, 1 replies.
- Algebird using spark-shell - posted by bdev <bu...@gmail.com> on 2014/10/30 05:19:12 UTC, 5 replies.
- MLLib: libsvm - default value initialization - posted by Sameer Tilak <ss...@live.com> on 2014/10/30 05:20:57 UTC, 1 replies.
- Spark Debugging - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/10/30 07:35:01 UTC, 3 replies.
- Embedding Spark Masters+Zk, Workers, SparkContext, App in single JVM, clustered (sorta for symmetric deployment) - posted by Aditya Varun Chadha <ad...@gmail.com> on 2014/10/30 08:07:38 UTC, 0 replies.
- NonSerializable Exception in foreachRDD - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/30 08:55:29 UTC, 2 replies.
- sharing RDDs between PySpark and Scala - posted by rok <ro...@gmail.com> on 2014/10/30 11:11:57 UTC, 0 replies.
- Spark + Tableau - posted by Bojan Kostic <bl...@gmail.com> on 2014/10/30 12:23:14 UTC, 4 replies.
- Getting vector values - posted by Andrejs Abele <an...@sindicetech.com> on 2014/10/30 12:38:01 UTC, 1 replies.
- Returned type of Broadcast variable is byte array - posted by Stephen Boesch <ja...@gmail.com> on 2014/10/30 15:42:06 UTC, 1 replies.
- Using a Database to persist and load data from - posted by Asaf Lahav <as...@gmail.com> on 2014/10/30 16:01:15 UTC, 3 replies.
- issue on applying SVM to 5 million examples. - posted by peng xia <to...@gmail.com> on 2014/10/30 16:22:59 UTC, 7 replies.
- Best way to partition RDD - posted by shahab <sh...@gmail.com> on 2014/10/30 17:16:11 UTC, 4 replies.
- Manipulating RDDs within a DStream - posted by Harold Nguyen <ha...@nexgate.com> on 2014/10/30 17:58:36 UTC, 7 replies.
- Doing RDD."count" in parallel, at least parallelize it as much as possible? - posted by shahab <sh...@gmail.com> on 2014/10/30 18:25:43 UTC, 4 replies.
- k-mean - result interpretation - posted by mgCl2 <fl...@gmail.com> on 2014/10/30 18:35:22 UTC, 1 replies.
- stage failure: java.lang.IllegalStateException: unread block data - posted by freedafeng <fr...@yahoo.com> on 2014/10/30 19:06:58 UTC, 1 replies.
- Re: Ambiguous references to id : what does it mean ? - posted by Terry Siu <Te...@smartfocus.com> on 2014/10/30 19:20:05 UTC, 0 replies.
- does updateStateByKey accept a state that is a tuple? - posted by spr <sp...@yarcdata.com> on 2014/10/30 19:38:49 UTC, 2 replies.
- Do Spark executors restrict native heap vs JVM heap? - posted by Paul Wais <pw...@yelp.com> on 2014/10/30 19:41:22 UTC, 1 replies.
- Re: Out of memory with Spark Streaming - posted by Chris Fregly <ch...@fregly.com> on 2014/10/30 20:45:16 UTC, 1 replies.
- akka connection refused bug, fix? - posted by freedafeng <fr...@yahoo.com> on 2014/10/30 21:22:20 UTC, 1 replies.
- Registering custom metrics - posted by Gerard Maas <ge...@gmail.com> on 2014/10/30 21:53:35 UTC, 0 replies.
- Creating a SchemaRDD from RDD of thrift classes - posted by ankits <an...@gmail.com> on 2014/10/30 22:13:35 UTC, 1 replies.
- how idf is calculated - posted by Andrejs Abele <an...@sindicetech.com> on 2014/10/30 23:13:49 UTC, 3 replies.
- SparkSQL + Hive Cached Table Exception - posted by Jean-Pascal Billaud <jp...@tellapart.com> on 2014/10/31 00:04:53 UTC, 1 replies.
- SparkContext UI - posted by Stuart Horsman <st...@gmail.com> on 2014/10/31 00:30:34 UTC, 4 replies.
- Scaladoc - posted by Alessandro Baretta <al...@gmail.com> on 2014/10/31 01:05:52 UTC, 1 replies.
- Confused about class paths in spark 1.1.0 - posted by Shay Seng <sh...@urbanengines.com> on 2014/10/31 01:24:53 UTC, 3 replies.
- Re: [scala-user] Why aggregate is inconsistent? - posted by Xuefeng Wu <be...@gmail.com> on 2014/10/31 03:48:18 UTC, 0 replies.
- SizeEstimator in Spark 1.1 and high load/object allocation when reading in data - posted by Erik Freed <er...@codecision.com> on 2014/10/31 04:34:03 UTC, 0 replies.
- Spark Streaming Issue not running 24/7 - posted by sivarani <wh...@gmail.com> on 2014/10/31 05:39:51 UTC, 1 replies.
- different behaviour of the same code - posted by lieyan <li...@yahoo.com> on 2014/10/31 08:31:26 UTC, 0 replies.
- about aggregateByKey and standard deviation - posted by qinwei <we...@dewmobile.net> on 2014/10/31 08:56:42 UTC, 0 replies.
- ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId not found - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/10/31 10:28:33 UTC, 0 replies.
- Spark SQL on Cassandra - posted by cis <cr...@gmail.com> on 2014/10/31 11:00:57 UTC, 1 replies.
- Repartitioning by partition size, not by number of partitions. - posted by ja...@centrum.cz on 2014/10/31 11:26:51 UTC, 2 replies.
- SQL COUNT DISTINCT - posted by Bojan Kostic <bl...@gmail.com> on 2014/10/31 12:45:56 UTC, 2 replies.
- Too many files open with Spark 1.1 and CDH 5.1 - posted by Bill Q <bi...@gmail.com> on 2014/10/31 15:25:29 UTC, 3 replies.
- SparkContext.stop() ? - posted by ll <du...@gmail.com> on 2014/10/31 16:12:19 UTC, 4 replies.
- Re: spark streaming - saving kafka DStream into hadoop throws exception - posted by Sean Owen <so...@cloudera.com> on 2014/10/31 17:19:09 UTC, 0 replies.
- A Spark Design Problem - posted by Steve Lewis <lo...@gmail.com> on 2014/10/31 17:44:20 UTC, 2 replies.
- properties file on a spark cluster - posted by Daniel Takabayashi <ta...@scanboo.com.br> on 2014/10/31 18:02:50 UTC, 0 replies.
- Accessing Cassandra with SparkSQL, Does not work? - posted by shahab <sh...@gmail.com> on 2014/10/31 18:25:18 UTC, 4 replies.
- LinearRegression and model prediction threshold - posted by Sameer Tilak <ss...@live.com> on 2014/10/31 19:18:40 UTC, 2 replies.
- Example of Fold - posted by Ron Ayoub <ro...@live.com> on 2014/10/31 20:01:16 UTC, 1 replies.
- Cannot instantiate hive context - posted by Pala M Muthaia <mc...@rocketfuelinc.com> on 2014/10/31 20:04:13 UTC, 0 replies.
- Spark Build - posted by Terry Siu <Te...@smartfocus.com> on 2014/10/31 20:21:08 UTC, 2 replies.
- Spark Standalone on cluster stops - posted by TJ Klein <TJ...@gmail.com> on 2014/10/31 21:20:44 UTC, 0 replies.
- Re: Help with error initializing SparkR. - posted by tongzzz <to...@gmail.com> on 2014/10/31 22:08:45 UTC, 0 replies.
- hadoop_conf_dir when running spark on yarn - posted by ameyc <am...@gmail.com> on 2014/10/31 23:44:27 UTC, 0 replies.