- SparkSQL performance - posted by Soumya Simanta <so...@gmail.com> on 2014/11/01 00:04:56 UTC, 4 replies.
- Spark Meetup in Singapore - posted by Social Marketing <we...@gmail.com> on 2014/11/01 01:12:02 UTC, 0 replies.
- Re: SparkContext UI - posted by Stuart Horsman <st...@gmail.com> on 2014/11/01 01:27:56 UTC, 0 replies.
- Some of the statistics function in SparkSQL is very slow - posted by Kevin Paul <ke...@gmail.com> on 2014/11/01 01:30:50 UTC, 0 replies.
- Re: "CANNOT FIND ADDRESS" - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/01 08:53:42 UTC, 1 replies.
- --executor-cores cannot change vcores in yarn? - posted by Gen <ge...@gmail.com> on 2014/11/01 11:15:11 UTC, 2 replies.
- Re: use additional ebs volumes for hsdf storage with spark-ec2 - posted by Marius Soutier <mp...@gmail.com> on 2014/11/01 16:23:54 UTC, 0 replies.
- Re: SparkSQL + Hive Cached Table Exception - posted by Cheng Lian <li...@gmail.com> on 2014/11/01 16:35:04 UTC, 2 replies.
- Re: A Spark Design Problem - posted by Steve Lewis <lo...@gmail.com> on 2014/11/01 18:27:31 UTC, 0 replies.
- Re: stage failure: java.lang.IllegalStateException: unread block data - posted by TJ Klein <TJ...@gmail.com> on 2014/11/01 19:23:51 UTC, 0 replies.
- Re: Spark speed performance - posted by ja...@centrum.cz on 2014/11/01 20:09:41 UTC, 2 replies.
- OOM with groupBy + saveAsTextFile - posted by Bharath Ravi Kumar <re...@gmail.com> on 2014/11/01 20:36:22 UTC, 9 replies.
- union of SchemaRDDs - posted by Daniel Mahler <dm...@gmail.com> on 2014/11/01 23:57:34 UTC, 3 replies.
- org.apache.hadoop.security.UserGroupInformation.doAs Issue - posted by TJ Klein <TJ...@gmail.com> on 2014/11/02 02:51:55 UTC, 0 replies.
- Re: Spark SQL : how to find element where a field is in a given set - posted by abhinav chowdary <ab...@gmail.com> on 2014/11/02 03:51:41 UTC, 1 replies.
- How to correctly extimate the number of partition of a graph in GraphX - posted by James <al...@gmail.com> on 2014/11/02 06:57:26 UTC, 2 replies.
- Re: Submiting Spark application through code - posted by Marius Soutier <mp...@gmail.com> on 2014/11/02 10:17:32 UTC, 2 replies.
- Spark on Yarn probably trying to load all the data to RAM - posted by ja...@centrum.cz on 2014/11/02 10:35:10 UTC, 6 replies.
- Re: properties file on a spark cluster - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/02 17:26:44 UTC, 0 replies.
- Re: ExecutorLostFailure (executor lost) - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/02 17:38:09 UTC, 0 replies.
- Re: Cannot instantiate hive context - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/02 17:41:10 UTC, 2 replies.
- Re: hadoop_conf_dir when running spark on yarn - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/02 17:47:57 UTC, 3 replies.
- RE: Prediction using Classification with text attributes in Apache Spark MLLib - posted by ashu <as...@iiitb.org> on 2014/11/02 18:20:55 UTC, 1 replies.
- Spark Master Web UI showing "0 cores" in Completed Applications - posted by Justin Yip <yi...@gmail.com> on 2014/11/03 03:26:42 UTC, 0 replies.
- How do I kill av job submitted with spark-submit - posted by Steve Lewis <lo...@gmail.com> on 2014/11/03 04:57:02 UTC, 0 replies.
- Do Spark executors restrict native heap vs JVM heap? - posted by Paul Wais <pw...@yelp.com> on 2014/11/03 05:40:35 UTC, 1 replies.
- Spark SQL takes unexpected time - posted by Shailesh Birari <sb...@wynyardgroup.com> on 2014/11/03 05:47:24 UTC, 4 replies.
- Re: Does SparkSQL work with custom defined SerDe? - posted by Chirag Aggarwal <Ch...@guavus.com> on 2014/11/03 06:00:42 UTC, 0 replies.
- Spark cluster stability - posted by jatinpreet <ja...@gmail.com> on 2014/11/03 06:25:11 UTC, 2 replies.
- Parquet files are only 6-20MB in size? - posted by ag007 <ag...@mac.com> on 2014/11/03 08:42:40 UTC, 4 replies.
- graph x extracting the path - posted by dizzy5112 <da...@gmail.com> on 2014/11/03 08:45:08 UTC, 0 replies.
- To find distances to reachable source vertices using GraphX - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2014/11/03 09:23:37 UTC, 2 replies.
- Re: SQL COUNT DISTINCT - posted by Bojan Kostic <bl...@gmail.com> on 2014/11/03 09:45:20 UTC, 2 replies.
- GraphX : Vertices details in Triangles - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2014/11/03 10:08:06 UTC, 1 replies.
- Re: about aggregateByKey and standard deviation - posted by Kamal Banga <ka...@sigmoidanalytics.com> on 2014/11/03 10:53:17 UTC, 0 replies.
- How number of partitions effect the performance? - posted by shahab <sh...@gmail.com> on 2014/11/03 10:57:14 UTC, 2 replies.
- Hive Context and Mapr - posted by "Addanki, Santosh Kumar" <sa...@sap.com> on 2014/11/03 11:34:40 UTC, 0 replies.
- Schema RDD and saveAsTable in hive - posted by "Addanki, Santosh Kumar" <sa...@sap.com> on 2014/11/03 12:05:18 UTC, 1 replies.
- unsubscribe - posted by Karthikeyan Arcot Kuppusamy <ka...@zanec.com> on 2014/11/03 13:23:19 UTC, 3 replies.
- Bug in DISK related Storage level? - posted by James <al...@gmail.com> on 2014/11/03 13:43:40 UTC, 0 replies.
- Dynamically switching Nr of allocated core - posted by RodrigoB <ro...@aspect.com> on 2014/11/03 14:17:44 UTC, 1 replies.
- Spark Kafka Performance - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/11/03 15:57:05 UTC, 1 replies.
- Spark job resource allocation best practices - posted by Romi Kuntsman <ro...@totango.com> on 2014/11/03 16:40:11 UTC, 8 replies.
- Cincinnati, OH Meetup for Apache Spark - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/11/03 17:05:50 UTC, 0 replies.
- random shuffle streaming RDDs? - posted by Josh J <jo...@gmail.com> on 2014/11/03 17:33:44 UTC, 5 replies.
- Key-Value decomposition - posted by david <da...@free.fr> on 2014/11/03 17:38:31 UTC, 4 replies.
- Accumulables in transformation operations - posted by Jorge Lopez-Malla <jl...@stratio.com> on 2014/11/03 18:16:16 UTC, 0 replies.
- Re: akka connection refused bug, fix? - posted by freedafeng <fr...@yahoo.com> on 2014/11/03 18:44:11 UTC, 0 replies.
- NoClassDefFoundError encountered in Spark 1.2-snapshot build with hive-0.13.1 profile - posted by Terry Siu <Te...@smartfocus.com> on 2014/11/03 19:42:19 UTC, 3 replies.
- MLlib - Naive Bayes Java example bug - posted by Dariusz Kobylarz <da...@gmail.com> on 2014/11/03 19:46:06 UTC, 1 replies.
- ParquetFilters and StringType support for GT, GTE, LT, LTE - posted by Terry Siu <Te...@smartfocus.com> on 2014/11/03 20:04:53 UTC, 2 replies.
- Model characterization - posted by Sameer Tilak <ss...@live.com> on 2014/11/03 21:52:54 UTC, 3 replies.
- Any "Replicated" RDD in Spark? - posted by Shuai Zheng <sz...@gmail.com> on 2014/11/03 22:03:17 UTC, 5 replies.
- Memory limitation on EMR Node? - posted by Shuai Zheng <sz...@gmail.com> on 2014/11/03 22:11:30 UTC, 0 replies.
- with SparkStreeaming spark-submit, don't see output after ssc.start() - posted by spr <sp...@yarcdata.com> on 2014/11/03 22:12:53 UTC, 5 replies.
- OOM - Requested array size exceeds VM limit - posted by akhandeshi <am...@gmail.com> on 2014/11/03 22:44:52 UTC, 0 replies.
- Spark streaming job failed due to "java.util.concurrent.TimeoutException" - posted by Bill Jay <bi...@gmail.com> on 2014/11/03 22:52:15 UTC, 0 replies.
- is spark a good fit for sequential machine learning algorithms? - posted by ll <du...@gmail.com> on 2014/11/03 22:55:58 UTC, 2 replies.
- Deleting temp dir Exception - posted by Josh <jo...@youneeq.ca> on 2014/11/04 00:08:58 UTC, 0 replies.
- Snappy and spark 1.1 - posted by Aravind Srinivasan <ar...@altiscale.com> on 2014/11/04 01:11:02 UTC, 1 replies.
- Spark Streaming - Most popular Twitter Hashtags - posted by Harold Nguyen <ha...@nexgate.com> on 2014/11/04 01:33:25 UTC, 1 replies.
- ERROR UserGroupInformation: PriviledgedActionException - posted by Saiph Kappa <sa...@gmail.com> on 2014/11/04 01:37:15 UTC, 4 replies.
- IllegalStateException: unread block data - posted by freedafeng <fr...@yahoo.com> on 2014/11/04 01:48:15 UTC, 1 replies.
- Re: different behaviour of the same code - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/04 02:38:45 UTC, 1 replies.
- cannot import name accumulators in python 2.7 - posted by felixgao <gu...@gmail.com> on 2014/11/04 02:42:34 UTC, 0 replies.
- How to make sure a ClassPath is always shipped to workers? - posted by Peng Cheng <pc...@uow.edu.au> on 2014/11/04 03:38:43 UTC, 4 replies.
- avro + parquet + vector + NullPointerException while reading - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2014/11/04 04:33:53 UTC, 2 replies.
- Cleaning/transforming json befor converting to SchemaRDD - posted by Daniel Mahler <dm...@gmail.com> on 2014/11/04 05:04:51 UTC, 2 replies.
- Executor Log Rotation Is Not Working? - posted by Ji ZHANG <zh...@gmail.com> on 2014/11/04 05:24:11 UTC, 1 replies.
- Re: MatrixFactorizationModel predict(Int, Int) API - posted by Xiangrui Meng <me...@gmail.com> on 2014/11/04 06:24:09 UTC, 4 replies.
- netty on classpath when using spark-submit - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/04 07:50:02 UTC, 3 replies.
- Got java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s when running job from intellij Idea - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/11/04 08:55:04 UTC, 3 replies.
- save as JSON objects - posted by Andrejs Abele <an...@insight-centre.org> on 2014/11/04 09:48:10 UTC, 2 replies.
- Re: java.io.NotSerializableException: org.apache.spark.SparkEnv - posted by sivarani <wh...@gmail.com> on 2014/11/04 10:54:23 UTC, 3 replies.
- pass unique ID to mllib algorithms pyspark - posted by jamborta <ja...@gmail.com> on 2014/11/04 11:30:51 UTC, 2 replies.
- loading, querying schemaRDD using SparkSQL - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/04 12:18:46 UTC, 4 replies.
- stdout in spark applications - posted by lokeshkumar <lo...@dataken.net> on 2014/11/04 13:45:14 UTC, 1 replies.
- Spark Streaming getOrCreate - posted by sivarani <wh...@gmail.com> on 2014/11/04 16:01:09 UTC, 2 replies.
- Re: Streaming window operations not producing output - posted by diogo <di...@uken.com> on 2014/11/04 16:20:50 UTC, 3 replies.
- MEMORY_ONLY_SER question - posted by Mohit Jaggi <mo...@gmail.com> on 2014/11/04 18:22:15 UTC, 4 replies.
- SparkSQL - No support for subqueries in 1.2-snapshot? - posted by Terry Siu <Te...@smartfocus.com> on 2014/11/04 18:40:18 UTC, 2 replies.
- Streaming: which code is (not) executed at every batch interval? - posted by spr <sp...@yarcdata.com> on 2014/11/04 18:43:39 UTC, 6 replies.
- Spark Streaming appears not to recognize a more recent version of an already-seen file; true? - posted by spr <sp...@yarcdata.com> on 2014/11/04 19:41:28 UTC, 2 replies.
- What's wrong with my settings about shuffle/storage.memoryFraction - posted by Benyi Wang <be...@gmail.com> on 2014/11/04 20:24:07 UTC, 0 replies.
- scala RDD sortby compilation error - posted by Josh J <jo...@gmail.com> on 2014/11/04 20:28:30 UTC, 3 replies.
- Fwd: Master example.MovielensALS - posted by Debasish Das <de...@gmail.com> on 2014/11/04 20:33:39 UTC, 0 replies.
- [ANN] Spark resources searchable - posted by Otis Gospodnetic <ot...@gmail.com> on 2014/11/04 21:12:52 UTC, 0 replies.
- spark sql create nested schema - posted by tridib <tr...@live.com> on 2014/11/04 21:19:03 UTC, 1 replies.
- StructField of StructType - posted by tridib <tr...@live.com> on 2014/11/04 21:21:21 UTC, 1 replies.
- Best practice for join - posted by Benyi Wang <be...@gmail.com> on 2014/11/04 21:23:46 UTC, 3 replies.
- Workers not registering after master restart - posted by Ashic Mahtab <as...@live.com> on 2014/11/05 00:00:42 UTC, 2 replies.
- Spark v Redshift - posted by agfung <ag...@gmail.com> on 2014/11/05 00:11:19 UTC, 6 replies.
- stackoverflow error - posted by Hongbin Liu <Ho...@theice.com> on 2014/11/05 00:13:52 UTC, 2 replies.
- Re: How to ship cython library to workers? - posted by freedafeng <fr...@yahoo.com> on 2014/11/05 00:28:12 UTC, 0 replies.
- Re: deploying a model built in mllib - posted by Simon Chan <si...@gmail.com> on 2014/11/05 01:57:33 UTC, 2 replies.
- spark_ec2.py for AWS region: cn-north-1, China - posted by "haitao .yao" <ya...@gmail.com> on 2014/11/05 02:09:15 UTC, 4 replies.
- Using SQL statements vs. SchemaRDD methods - posted by SK <sk...@gmail.com> on 2014/11/05 02:22:13 UTC, 1 replies.
- Why mapred for the HadoopRDD? - posted by Corey Nolet <cj...@gmail.com> on 2014/11/05 02:29:38 UTC, 1 replies.
- GraphX and Spark - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/05 05:11:50 UTC, 1 replies.
- MLlib and PredictionIO sample code - posted by Simon Chan <si...@gmail.com> on 2014/11/05 06:19:43 UTC, 0 replies.
- Issue in Spark Streaming - posted by Suman S Patil <Su...@lntinfotech.com> on 2014/11/05 06:34:19 UTC, 1 replies.
- Kafka Consumer in Spark Streaming - posted by Something Something <ma...@gmail.com> on 2014/11/05 06:57:20 UTC, 9 replies.
- How to increase hdfs read parallelism - posted by Rajat Verma <ve...@gmail.com> on 2014/11/05 07:51:23 UTC, 0 replies.
- sparse x sparse matrix multiplication - posted by ll <du...@gmail.com> on 2014/11/05 08:58:11 UTC, 8 replies.
- Re: Matrix multiplication in spark - posted by ll <du...@gmail.com> on 2014/11/05 09:23:44 UTC, 3 replies.
- Re: NullPointerException on reading checkpoint files - posted by sivarani <wh...@gmail.com> on 2014/11/05 10:14:31 UTC, 1 replies.
- add support for separate GC log files for different executor - posted by "haitao .yao" <ya...@gmail.com> on 2014/11/05 10:53:38 UTC, 0 replies.
- Dynamically InferSchema From Hive and Create parquet file - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2014/11/05 11:15:39 UTC, 4 replies.
- Change in the API for streamingcontext.actorStream? - posted by Shiti Saxena <ss...@gmail.com> on 2014/11/05 11:30:51 UTC, 0 replies.
- using LogisticRegressionWithSGD.train in Python crashes with "Broken pipe" - posted by rok <ro...@gmail.com> on 2014/11/05 11:38:09 UTC, 5 replies.
- Standalone Specify mem / cores defaults - posted by Ashic Mahtab <as...@live.com> on 2014/11/05 12:21:19 UTC, 1 replies.
- Unsubscribe - posted by mrugen deshmukh <mr...@gmail.com> on 2014/11/05 13:33:28 UTC, 1 replies.
- why decision trees do binary split? - posted by jamborta <ja...@gmail.com> on 2014/11/05 14:04:17 UTC, 5 replies.
- Re: I want to make clear the difference about executor-cores number. - posted by jamborta <ja...@gmail.com> on 2014/11/05 14:13:53 UTC, 1 replies.
- Starting Spark Master on CDH5.2/Spark v1.1.0 fails. Indication is: 'SCALA_HOME is not set' - posted by prismalytics <su...@prismalytics.io> on 2014/11/05 17:58:00 UTC, 0 replies.
- Understanding spark operation pipeline and block storage - posted by Hao Ren <in...@gmail.com> on 2014/11/05 18:39:51 UTC, 4 replies.
- Any limitations of spark.shuffle.spill? - posted by Yangcheng Huang <ya...@huawei.com> on 2014/11/05 19:04:38 UTC, 1 replies.
- Not Specifying --class for spark-submit - posted by jgarrett <jo...@noblis-nsp.com> on 2014/11/05 19:27:48 UTC, 0 replies.
- Logging from the Spark shell - posted by "Ulanov, Alexander" <al...@hp.com> on 2014/11/05 21:57:27 UTC, 0 replies.
- AVRO specific records - posted by Simone Franzini <ca...@gmail.com> on 2014/11/05 22:25:30 UTC, 5 replies.
- Partition sorting by Spark framework - posted by nitinkak001 <ni...@gmail.com> on 2014/11/05 22:39:10 UTC, 2 replies.
- Question regarding sorting and grouping - posted by Ping Tang <pt...@aerohive.com> on 2014/11/05 23:30:39 UTC, 0 replies.
- cache function is not working on RDD from parallelize - posted by Edwin <al...@yahoo.com> on 2014/11/05 23:38:09 UTC, 0 replies.
- Configuring custom input format - posted by Corey Nolet <cj...@gmail.com> on 2014/11/05 23:49:54 UTC, 5 replies.
- [SQL] PERCENTILE is not working - posted by Kevin Paul <ke...@gmail.com> on 2014/11/06 00:08:39 UTC, 2 replies.
- Re: Breaking the previous large-scale sort record with Spark - posted by Reynold Xin <rx...@databricks.com> on 2014/11/06 00:11:42 UTC, 1 replies.
- how to blend a DStream and a broadcast variable? - posted by spr <sp...@yarcdata.com> on 2014/11/06 00:30:22 UTC, 2 replies.
- How to trace/debug serialization? - posted by ankits <an...@gmail.com> on 2014/11/06 00:56:52 UTC, 5 replies.
- SparkContext._lock Error - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/11/06 01:21:45 UTC, 3 replies.
- How to get Spark User List Digests only, but still be able to post questions ... - posted by prismalytics <su...@prismalytics.io> on 2014/11/06 01:25:18 UTC, 0 replies.
- Spark SQL Hive Version - posted by "Cheng, Hao" <ha...@intel.com> on 2014/11/06 01:47:38 UTC, 1 replies.
- log4j logging control via sbt - posted by Simon Hafner <re...@gmail.com> on 2014/11/06 02:22:57 UTC, 1 replies.
- Re: How to avoid use snappy compression when saveAsSequenceFile? - posted by buring <qy...@gmail.com> on 2014/11/06 03:05:43 UTC, 1 replies.
- Errors in Spark streaming application due to HDFS append - posted by Ping Tang <pt...@aerohive.com> on 2014/11/06 03:33:55 UTC, 0 replies.
- Task size variation while using Range Vs List - posted by nsareen <ns...@gmail.com> on 2014/11/06 04:35:50 UTC, 2 replies.
- JavaStreamingContextFactory checkpoint directory NotSerializableException - posted by Vasu C <va...@gmail.com> on 2014/11/06 06:42:37 UTC, 5 replies.
- Re: Spark Streaming: foreachRDD network output - posted by sivarani <wh...@gmail.com> on 2014/11/06 07:31:32 UTC, 0 replies.
- Snappy temp files not cleaned up - posted by Romi Kuntsman <ro...@totango.com> on 2014/11/06 08:15:11 UTC, 1 replies.
- Number cores split up - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/06 08:27:54 UTC, 0 replies.
- Unable to use HiveContext in spark-shell - posted by tridib <tr...@live.com> on 2014/11/06 08:53:39 UTC, 7 replies.
- CheckPoint Issue with JsonRDD - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2014/11/06 10:06:17 UTC, 1 replies.
- multiple spark context in same driver program - posted by Paweł Szulc <pa...@gmail.com> on 2014/11/06 17:00:17 UTC, 3 replies.
- Task duration graph on Spark stage UI - posted by Daniel Darabos <da...@lynxanalytics.com> on 2014/11/06 18:04:40 UTC, 0 replies.
- RE: spark sql: join sql fails after sqlCtx.cacheTable() - posted by Tridib Samanta <tr...@live.com> on 2014/11/06 18:25:07 UTC, 0 replies.
- Spark and Kafka - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/11/06 18:32:23 UTC, 1 replies.
- most efficient way to send data from Scala to python - posted by jamborta <ja...@gmail.com> on 2014/11/06 19:00:37 UTC, 0 replies.
- Re: PySpark issue with sortByKey: "IndexError: list index out of range" - posted by skane <sk...@websense.com> on 2014/11/06 19:39:10 UTC, 5 replies.
- specifying sort order for sort by value - posted by SK <sk...@gmail.com> on 2014/11/06 19:50:04 UTC, 2 replies.
- SparkSubmitDriverBootstrapper and JVM parameters - posted by akhandeshi <am...@gmail.com> on 2014/11/06 19:50:09 UTC, 0 replies.
- Kinesis integration with Spark Streaming in EMR cluster - Output is not showing up - posted by sriks <sr...@gmail.com> on 2014/11/06 20:51:32 UTC, 0 replies.
- Redploying a spark streaming application - posted by Ashic Mahtab <as...@live.com> on 2014/11/06 23:01:34 UTC, 3 replies.
- job works well on small data set but fails on large data set - posted by HARIPRIYA AYYALASOMAYAJULA <ah...@gmail.com> on 2014/11/07 00:38:30 UTC, 0 replies.
- Any patterns for multiplexing the streaming data - posted by bdev <bu...@gmail.com> on 2014/11/07 01:15:27 UTC, 2 replies.
- Store DStreams into Hive using Hive Streaming - posted by Luiz Geovani Vier <lg...@gmail.com> on 2014/11/07 01:46:33 UTC, 3 replies.
- Re: Selecting Based on Nested Values using Language Integrated Query Syntax - posted by Corey Nolet <cj...@gmail.com> on 2014/11/07 02:36:14 UTC, 0 replies.
- Is there a way to limit the sql query result size? - posted by sagi <zh...@gmail.com> on 2014/11/07 04:28:01 UTC, 0 replies.
- Collect method in Spark - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/07 06:11:18 UTC, 1 replies.
- Nesting RDD - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/07 07:28:03 UTC, 1 replies.
- Parallelize on spark context - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/07 07:43:06 UTC, 3 replies.
- word2vec: how to save an mllib model and reload it? - posted by ll <du...@gmail.com> on 2014/11/07 08:26:34 UTC, 12 replies.
- Re: Bug in Accumulators... - posted by Shixiong Zhu <zs...@gmail.com> on 2014/11/07 09:03:25 UTC, 9 replies.
- sql - group by on UDF not working - posted by Tridib Samanta <tr...@live.com> on 2014/11/07 09:44:20 UTC, 1 replies.
- about write mongodb in mapPartitions - posted by qinwei <we...@dewmobile.net> on 2014/11/07 10:23:09 UTC, 4 replies.
- Native / C/C++ code integration - posted by Paul Wais <pa...@gmail.com> on 2014/11/07 12:05:18 UTC, 1 replies.
- Re: LZO support in Spark 1.0.0 - nothing seems to work - posted by Sree Harsha <99...@gmail.com> on 2014/11/07 13:22:40 UTC, 0 replies.
- MESOS slaves shut down due to "'health check timed out" - posted by Yangcheng Huang <ya...@huawei.com> on 2014/11/07 16:18:31 UTC, 0 replies.
- error when importing HiveContext - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/11/07 17:15:33 UTC, 1 replies.
- where is the org.apache.spark.util package? - posted by ll <du...@gmail.com> on 2014/11/07 18:48:03 UTC, 1 replies.
- spark-submit inside script... need some bash help - posted by Koert Kuipers <ko...@tresata.com> on 2014/11/07 20:01:09 UTC, 1 replies.
- partitioning to speed up queries - posted by Gordon Benjamin <go...@gmail.com> on 2014/11/07 21:07:56 UTC, 0 replies.
- Multiple Applications(Spark Contexts) Concurrently Fail With Broadcast Error - posted by ryaminal <ta...@gmail.com> on 2014/11/07 21:17:08 UTC, 0 replies.
- Still struggling with building documentation - posted by Alessandro Baretta <al...@gmail.com> on 2014/11/07 21:39:11 UTC, 4 replies.
- jsonRdd and MapType - posted by boclair <bo...@gmail.com> on 2014/11/07 21:41:20 UTC, 1 replies.
- spark streaming: stderr does not roll - posted by "Nguyen, Duc" <du...@pearson.com> on 2014/11/07 23:35:22 UTC, 1 replies.
- Integrating Spark with other applications - posted by gtinside <gt...@gmail.com> on 2014/11/07 23:42:05 UTC, 1 replies.
- spark context not defined - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/11/08 00:05:35 UTC, 1 replies.
- MatrixFactorizationModel serialization - posted by Dariusz Kobylarz <da...@gmail.com> on 2014/11/08 00:33:11 UTC, 1 replies.
- SparkPi endlessly in "yarnAppState: ACCEPTED" - posted by YaoPau <jo...@gmail.com> on 2014/11/08 01:40:40 UTC, 1 replies.
- Spark 1.1.0 Can not read snappy compressed sequence file - posted by Stéphane Verlet <ka...@gmail.com> on 2014/11/08 04:12:45 UTC, 1 replies.
- How to add elements into map? - posted by Tim Chou <ti...@gmail.com> on 2014/11/08 05:16:42 UTC, 3 replies.
- Re: Fwd: Why is Spark not using all cores on a single machine? - posted by ll <du...@gmail.com> on 2014/11/08 06:04:59 UTC, 1 replies.
- Re: Viewing web UI after fact - posted by Arun Ahuja <aa...@gmail.com> on 2014/11/08 07:11:46 UTC, 0 replies.
- Re: Using partitioning to speed up queries in Shark - posted by Mayur Rustagi <ma...@gmail.com> on 2014/11/08 08:13:00 UTC, 0 replies.
- Spark on YARN, ExecutorLostFailure for long running computations in map - posted by ja...@centrum.cz on 2014/11/08 10:28:58 UTC, 2 replies.
- Issue with Custom Key Class - posted by Bahubali Jain <ba...@gmail.com> on 2014/11/08 12:15:23 UTC, 1 replies.
- Re: org/apache/commons/math3/random/RandomGenerator issue - posted by lev <ka...@gmail.com> on 2014/11/08 13:20:07 UTC, 7 replies.
- Embedding static files in a spark app - posted by Jay Vyas <ja...@gmail.com> on 2014/11/08 14:15:50 UTC, 0 replies.
- Debian package for spark? - posted by Kevin Burton <bu...@spinn3r.com> on 2014/11/08 19:47:13 UTC, 7 replies.
- Do spark works on multicore systems? - posted by hmushtaq <pe...@gmail.com> on 2014/11/08 23:17:45 UTC, 1 replies.
- Does spark works on multicore systems? - posted by Blind Faith <pe...@gmail.com> on 2014/11/08 23:20:18 UTC, 3 replies.
- Unresolved Attributes - posted by Srinivas Chamarthi <sr...@gmail.com> on 2014/11/09 00:26:15 UTC, 1 replies.
- contains in array in Spark SQL - posted by Srinivas Chamarthi <sr...@gmail.com> on 2014/11/09 01:55:11 UTC, 0 replies.
- wierd caching - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/11/09 07:16:28 UTC, 1 replies.
- Make Spark Job Board permanent. - posted by Egor Pahomov <pa...@gmail.com> on 2014/11/09 08:42:35 UTC, 0 replies.
- Why does this siimple spark program uses only one core? - posted by ReticulatedPython <pe...@gmail.com> on 2014/11/09 14:18:31 UTC, 2 replies.
- supported sql functions - posted by Srinivas Chamarthi <sr...@gmail.com> on 2014/11/09 14:57:44 UTC, 2 replies.
- Re: Submitting Spark job on Unix cluster from dev environment (Windows) - posted by Shailesh Birari <sb...@wynyardgroup.com> on 2014/11/09 21:01:53 UTC, 1 replies.
- Repartition to data-size per partition - posted by Harry Brundage <ha...@shopify.com> on 2014/11/10 05:19:32 UTC, 0 replies.
- embedded spark for unit testing.. - posted by Kevin Burton <bu...@spinn3r.com> on 2014/11/10 06:12:18 UTC, 1 replies.
- MLlib Naive Bayes classifier confidence - posted by jatinpreet <ja...@gmail.com> on 2014/11/10 06:45:09 UTC, 4 replies.
- Queues - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/10 06:48:20 UTC, 0 replies.
- Rdd replication - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/10 07:00:44 UTC, 0 replies.
- Efficient Key Structure in pairRDD - posted by nsareen <ns...@gmail.com> on 2014/11/10 08:39:07 UTC, 1 replies.
- canopy clustering - posted by aminn_524 <am...@yahoo.com> on 2014/11/10 08:54:08 UTC, 1 replies.
- "-Error stopping receiver" in running Spark+Flume sample code "FlumeEventCount.scala" - posted by Ping Tang <pt...@aerohive.com> on 2014/11/10 09:31:01 UTC, 0 replies.
- dealing with large values in kv pairs - posted by YANG Fan <id...@gmail.com> on 2014/11/10 09:34:22 UTC, 1 replies.
- closure serialization behavior driving me crazy - posted by Sandy Ryza <sa...@cloudera.com> on 2014/11/10 10:01:00 UTC, 3 replies.
- index File create by mapFile can't - posted by buring <qy...@gmail.com> on 2014/11/10 10:04:50 UTC, 0 replies.
- index File create by mapFile can't read - posted by buring <qy...@gmail.com> on 2014/11/10 10:07:33 UTC, 0 replies.
- Is there a step-by-step instruction on how to build Spark App with IntelliJ IDEA? - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2014/11/10 11:38:24 UTC, 5 replies.
- Solidifying Understanding of Standalone Mode - posted by Ashic Mahtab <as...@live.com> on 2014/11/10 11:42:35 UTC, 0 replies.
- Spark Web UI is not showing Running / Completed / Active Applications - posted by Samarth Mailinglist <ma...@gmail.com> on 2014/11/10 11:55:48 UTC, 5 replies.
- Mysql retrieval and storage using JdbcRDD - posted by akshayhazari <ak...@gmail.com> on 2014/11/10 12:40:52 UTC, 0 replies.
- Backporting spark 1.1.0 to CDH 5.1.3 - posted by "Zalzberg, Idan (Agoda)" <Id...@agoda.com> on 2014/11/10 12:58:24 UTC, 2 replies.
- Removing INFO logs - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2014/11/10 13:21:01 UTC, 2 replies.
- Executor Lost Failure - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2014/11/10 14:21:02 UTC, 5 replies.
- Running Spark on SPARC64 X+ - posted by Greg Jennings <je...@gmail.com> on 2014/11/10 14:34:01 UTC, 0 replies.
- To generate IndexedRowMatrix from an RowMatrix - posted by Lijun Wang <wa...@gmail.com> on 2014/11/10 14:57:57 UTC, 3 replies.
- Increase Executor Memory on YARN - posted by Mudassar Sarwar <mu...@northbaysolutions.net> on 2014/11/10 14:58:32 UTC, 1 replies.
- Question about textFileStream - posted by Saiph Kappa <sa...@gmail.com> on 2014/11/10 18:20:23 UTC, 3 replies.
- spark SNAPSHOT repo - posted by jamborta <ja...@gmail.com> on 2014/11/10 18:40:18 UTC, 2 replies.
- Kafka version dependency in Spark 1.2 - posted by Bhaskar Dutta <bh...@gmail.com> on 2014/11/10 18:48:35 UTC, 3 replies.
- which is the recommended workflow engine for Apache Spark jobs? - posted by Adamantios Corais <ad...@gmail.com> on 2014/11/10 19:34:00 UTC, 3 replies.
- Mapping SchemaRDD/Row to JSON - posted by Akshat Aranya <aa...@gmail.com> on 2014/11/10 20:12:46 UTC, 2 replies.
- Re: disable log4j for spark-shell - posted by hmxxyy <hm...@gmail.com> on 2014/11/10 20:17:41 UTC, 5 replies.
- Status of MLLib exporting models to PMML - posted by Aris <ar...@gmail.com> on 2014/11/10 20:27:07 UTC, 12 replies.
- MLLib Decision Tress algorithm hangs, others fine - posted by tsj <ts...@gmail.com> on 2014/11/10 20:28:38 UTC, 1 replies.
- Custom persist or cache of RDD? - posted by Benyi Wang <be...@gmail.com> on 2014/11/10 20:33:20 UTC, 2 replies.
- Re: Spray client reports Exception: akka.actor.ActorSystem.dispatcher()Lscala/concurrent/ExecutionContext - posted by Srinivas Chamarthi <sr...@gmail.com> on 2014/11/10 21:06:54 UTC, 1 replies.
- Spray with Spark-sql build fails with Incompatible dependencies - posted by Srinivas Chamarthi <sr...@gmail.com> on 2014/11/10 21:40:40 UTC, 0 replies.
- streaming linear regression is not building the model - posted by "Bui, Tri" <Tr...@VerizonWireless.com.INVALID> on 2014/11/10 22:03:36 UTC, 1 replies.
- Spark Master crashes job on task failure - posted by "Griffiths, Michael (NYC-RPM)" <Mi...@reprisemedia.com> on 2014/11/10 22:47:34 UTC, 1 replies.
- convert List to dstream - posted by Josh J <jo...@gmail.com> on 2014/11/10 23:43:01 UTC, 1 replies.
- JavaKafkaWordCount not working under Spark Streaming - posted by Something Something <ma...@gmail.com> on 2014/11/11 00:01:53 UTC, 4 replies.
- Building spark from source - assertion failed: org.eclipse.jetty.server.DispatcherType - posted by jamborta <ja...@gmail.com> on 2014/11/11 00:26:17 UTC, 2 replies.
- thrift jdbc server probably running queries as hive query - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/11 01:29:27 UTC, 3 replies.
- Question about RDD Union and SubtractByKey - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/11/11 02:43:21 UTC, 0 replies.
- inconsistent edge counts in GraphX - posted by "Buttler, David" <bu...@llnl.gov> on 2014/11/11 02:51:43 UTC, 1 replies.
- Checkpoint bugs in GraphX - posted by Xu Lijie <li...@gmail.com> on 2014/11/11 03:19:03 UTC, 1 replies.
- Discuss how to do checkpoint more efficently - posted by Xu Lijie <li...@gmail.com> on 2014/11/11 04:32:12 UTC, 0 replies.
- Strange behavior of spark-shell while accessing hdfs - posted by hmxxyy <hm...@gmail.com> on 2014/11/11 06:04:47 UTC, 4 replies.
- Re: how to use JNI in spark? - posted by tangweihan <ta...@360.cn> on 2014/11/11 07:21:12 UTC, 0 replies.
- How to change the default limiter for textFile function - posted by Blind Faith <pe...@gmail.com> on 2014/11/11 09:36:38 UTC, 0 replies.
- Is there any benchmark suite for spark? - posted by Hu Liu <hl...@pivotal.io> on 2014/11/11 10:25:10 UTC, 0 replies.
- replace ConnectionManager#ackTimeoutMonitor with ScheduledExecutorService to avoid OOME under long timeout - posted by "haitao .yao" <ya...@gmail.com> on 2014/11/11 10:50:42 UTC, 0 replies.
- Cassandra spark connector exception: "NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;" - posted by shahab <sh...@gmail.com> on 2014/11/11 12:13:15 UTC, 2 replies.
- JdbcRDD and ClassTag issue - posted by nitinkalra2000 <ni...@gmail.com> on 2014/11/11 12:45:15 UTC, 0 replies.
- java.lang.NoSuchMethodError: twitter4j.TwitterStream.addListener - posted by "Jishnu Menath Prathap (WT01 - BAS)" <ji...@wipro.com> on 2014/11/11 13:18:29 UTC, 3 replies.
- Where can I find logs from workers PySpark - posted by ja...@centrum.cz on 2014/11/11 13:36:43 UTC, 0 replies.
- save as file - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/11 13:58:13 UTC, 2 replies.
- Spark-submit and Windows / Linux mixed network - posted by Ashic Mahtab <as...@live.com> on 2014/11/11 14:14:11 UTC, 2 replies.
- Best practice for multi-user web controller in front of Spark - posted by bethesda <sw...@mac.com> on 2014/11/11 14:50:22 UTC, 4 replies.
- subscribe - posted by DAVID SWEARINGEN <sw...@mac.com> on 2014/11/11 14:51:03 UTC, 0 replies.
- How to kill a Spark job running in cluster mode ? - posted by Tao Xiao <xi...@gmail.com> on 2014/11/11 14:58:32 UTC, 4 replies.
- Re: Spark + Tableau - posted by Bojan Kostic <bl...@gmail.com> on 2014/11/11 15:38:35 UTC, 0 replies.
- Re: Broadcast failure with variable size of ~ 500mb with "key already cancelled ?" - posted by Tom Seddon <mr...@gmail.com> on 2014/11/11 15:39:57 UTC, 1 replies.
- Re: ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId - posted by Tom Seddon <mr...@gmail.com> on 2014/11/11 16:17:22 UTC, 0 replies.
- scala.MatchError - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/11 16:18:00 UTC, 3 replies.
- Combining data from two tables in two databases postgresql, JdbcRDD. - posted by akshayhazari <ak...@gmail.com> on 2014/11/11 16:24:18 UTC, 0 replies.
- Spark and Play - posted by Akshat Aranya <aa...@gmail.com> on 2014/11/11 17:21:48 UTC, 5 replies.
- What should be the number of partitions after a union and a subtractByKey - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/11/11 17:40:58 UTC, 0 replies.
- data locality, task distribution - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/11/11 17:58:41 UTC, 5 replies.
- Re: How to execute a function from class in distributed jar on each worker node? - posted by aaronjosephs <aa...@placeiq.com> on 2014/11/11 20:00:25 UTC, 0 replies.
- S3 table to spark sql - posted by Franco Barrientos <fr...@exalitica.com> on 2014/11/11 20:11:09 UTC, 1 replies.
- pyspark get column family and qualifier names from hbase table - posted by freedafeng <fr...@yahoo.com> on 2014/11/11 20:31:34 UTC, 6 replies.
- filtering out non English tweets using TwitterUtils - posted by SK <sk...@gmail.com> on 2014/11/11 20:41:02 UTC, 5 replies.
- Failed jobs showing as SUCCEEDED on web UI - posted by Brett Meyer <Br...@crowdstrike.com> on 2014/11/11 20:47:15 UTC, 0 replies.
- failed to create a table with python (single node) - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2014/11/11 21:19:59 UTC, 0 replies.
- groupBy for DStream - posted by SK <sk...@gmail.com> on 2014/11/11 22:19:53 UTC, 2 replies.
- Help with processing multiple RDDs - posted by akhandeshi <am...@gmail.com> on 2014/11/11 23:13:47 UTC, 4 replies.
- concat two Dstreams - posted by Josh J <jo...@gmail.com> on 2014/11/11 23:41:52 UTC, 2 replies.
- Partition caching taking too long - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/12 00:38:10 UTC, 0 replies.
- Converting Apache log string into map using delimiter - posted by YaoPau <jo...@gmail.com> on 2014/11/12 00:59:35 UTC, 2 replies.
- in function prototypes? - posted by spr <sp...@yarcdata.com> on 2014/11/12 01:13:18 UTC, 0 replies.
- "overloaded method value updateStateByKey ... cannot be applied to ..." when Key is a Tuple2 - posted by spr <sp...@yarcdata.com> on 2014/11/12 01:26:13 UTC, 5 replies.
- SVMWithSGD default threshold - posted by Caron <ca...@gmail.com> on 2014/11/12 01:41:40 UTC, 4 replies.
- Re: how to create a Graph in GraphX? - posted by ankurdave <an...@gmail.com> on 2014/11/12 01:42:40 UTC, 0 replies.
- Does spark can't work with HBase? - posted by gzlj <li...@sinobest.cn> on 2014/11/12 02:10:06 UTC, 0 replies.
- Imbalanced shuffle read - posted by ankits <an...@gmail.com> on 2014/11/12 02:15:21 UTC, 4 replies.
- ISpark class not found - posted by "Laird, Benjamin" <Be...@capitalone.com> on 2014/11/12 02:21:21 UTC, 2 replies.
- MLLIB usage: BLAS dependency warning - posted by jpl <jl...@soe.ucsc.edu> on 2014/11/12 04:11:30 UTC, 5 replies.
- Pyspark Error when broadcast numpy array - posted by bliuab <bl...@cse.ust.hk> on 2014/11/12 04:47:00 UTC, 5 replies.
- External table partitioned by date using Spark SQL - posted by ehalpern <er...@gmail.com> on 2014/11/12 04:51:34 UTC, 1 replies.
- How to solve this core dump error - posted by shiwentao <19...@qq.com> on 2014/11/12 05:19:08 UTC, 0 replies.
- Read a HDFS file from Spark source code - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/12 05:42:32 UTC, 2 replies.
- spark-shell exception while running in YARN mode - posted by hmxxyy <hm...@gmail.com> on 2014/11/12 06:47:03 UTC, 1 replies.
- Is there a way to clone a JavaRDD without persisting it - posted by Steve Lewis <lo...@gmail.com> on 2014/11/12 07:23:29 UTC, 1 replies.
- How did the RDD.union work - posted by qiaou <qi...@gmail.com> on 2014/11/12 07:31:13 UTC, 4 replies.
- Re: How did the RDD.union work - posted by qiaou <qi...@gmail.com> on 2014/11/12 07:53:32 UTC, 2 replies.
- spark sql - save to Parquet file - Unsupported datatype TimestampType - posted by tridib <tr...@live.com> on 2014/11/12 07:54:28 UTC, 0 replies.
- About Join operator in PySpark - posted by 夏俊鸾 <xi...@gmail.com> on 2014/11/12 08:31:19 UTC, 0 replies.
- Task time measurement - posted by Romi Kuntsman <ro...@totango.com> on 2014/11/12 08:42:07 UTC, 0 replies.
- Best way of transforming stack traces - posted by Kevin Kilroy <ke...@gmail.com> on 2014/11/12 10:07:22 UTC, 0 replies.
- Nested Complex Type Data Parsing and Transforming to table - posted by lu...@sina.com on 2014/11/12 10:38:58 UTC, 1 replies.
- Pass RDD to functions - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/12 10:54:22 UTC, 2 replies.
- snappy error - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/12 10:58:37 UTC, 2 replies.
- Re: Scala vs Python performance differences - posted by Andrew Ash <an...@andrewash.com> on 2014/11/12 11:12:37 UTC, 1 replies.
- Spark and insertion into RDBMS/NoSQL - posted by nitinkalra2000 <ni...@gmail.com> on 2014/11/12 11:55:27 UTC, 0 replies.
- Number of partitions in RDD for input DStreams - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2014/11/12 12:11:53 UTC, 0 replies.
- Spark SQL configurations - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/12 12:44:42 UTC, 2 replies.
- Java client connection - posted by Eduardo Cusa <ed...@usmediaconsulting.com> on 2014/11/12 15:09:01 UTC, 0 replies.
- why flatmap has shuffle - posted by qinwei <we...@dewmobile.net> on 2014/11/12 15:18:24 UTC, 0 replies.
- Snappy error with Spark SQL - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/12 15:28:42 UTC, 1 replies.
- Getting py4j.protocol.Py4JError: An error occurred while calling o39.predict. while doing batch prediction using decision trees - posted by rprabhu <rp...@ufl.edu> on 2014/11/12 16:20:36 UTC, 0 replies.
- join 2 tables - posted by Franco Barrientos <fr...@exalitica.com> on 2014/11/12 17:57:59 UTC, 1 replies.
- Too many failed collects when trying to cache a table in SparkSQL - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/12 18:31:12 UTC, 2 replies.
- No module named pyspark - latest built - posted by jamborta <ja...@gmail.com> on 2014/11/12 18:44:44 UTC, 7 replies.
- Re: Getting py4j.protocol.Py4JError: An error occurred while calling o39.predict. while doing batch prediction using decision trees - posted by Davies Liu <da...@databricks.com> on 2014/11/12 19:16:31 UTC, 1 replies.
- using RDD result in another TDD - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/11/12 19:41:47 UTC, 1 replies.
- Reading from Hbase using python - posted by Alan Prando <al...@scanboo.com.br> on 2014/11/12 20:32:49 UTC, 3 replies.
- Wildly varying "aggregate" performance depending on code location - posted by Jim Carroll <ji...@gmail.com> on 2014/11/12 20:34:07 UTC, 1 replies.
- Building spark targz - posted by Ashwin Shankar <as...@gmail.com> on 2014/11/12 21:14:08 UTC, 5 replies.
- How can my java code executing on a slave find the task id? - posted by Steve Lewis <lo...@gmail.com> on 2014/11/12 22:19:49 UTC, 0 replies.
- ec2 script and SPARK_LOCAL_DIRS not created - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/11/12 22:35:48 UTC, 0 replies.
- Spark SQL Lazy Schema Evaluation - posted by Corey Nolet <cj...@gmail.com> on 2014/11/12 23:05:10 UTC, 1 replies.
- How (in Java) do I create an Accumulator of type Long - posted by Steve Lewis <lo...@gmail.com> on 2014/11/13 00:05:24 UTC, 3 replies.
- Cache sparkSql data without uncompressing it in memory - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/13 00:16:32 UTC, 5 replies.
- spark.parallelize seems broken on type - posted by mod0 <mb...@uwaterloo.ca> on 2014/11/13 01:33:21 UTC, 0 replies.
- Map output statuses exceeds frameSize - posted by pouryas <po...@adbrain.com> on 2014/11/13 01:36:47 UTC, 1 replies.
- Spark streaming cannot receive any message from Kafka - posted by Bill Jay <bi...@gmail.com> on 2014/11/13 01:39:00 UTC, 9 replies.
- Using data in RDD to specify HDFS directory to write to - posted by jschindler <jo...@utexas.edu> on 2014/11/13 01:57:15 UTC, 5 replies.
- Cannot summit Spark app to cluster, stuck on “UNDEFINED” - posted by brother rain <br...@gmail.com> on 2014/11/13 02:18:19 UTC, 0 replies.
- flatMap followed by mapPartitions - posted by Debasish Das <de...@gmail.com> on 2014/11/13 03:01:29 UTC, 2 replies.
- Re: Unit testing jar request - posted by nightwolf <ni...@gmail.com> on 2014/11/13 04:04:42 UTC, 0 replies.
- Assigning input files to spark partitions - posted by Pala M Muthaia <mc...@rocketfuelinc.com> on 2014/11/13 04:27:12 UTC, 7 replies.
- Query from two or more tables Spark Sql .I have done this . Is there any simpler solution. - posted by akshayhazari <ak...@gmail.com> on 2014/11/13 06:12:20 UTC, 0 replies.
- Can spark read and write to cassandra without HDFS? - posted by Kevin Burton <bu...@spinn3r.com> on 2014/11/13 06:28:44 UTC, 2 replies.
- Re: RDD to DStream - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/13 06:51:28 UTC, 0 replies.
- Joined RDD - posted by ajay garg <aj...@mobileum.com> on 2014/11/13 07:56:07 UTC, 3 replies.
- Saving RDD into DB & then Reading back from DB - posted by nsareen <ns...@gmail.com> on 2014/11/13 08:07:58 UTC, 0 replies.
- StreamingContext does not stop - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/13 09:10:06 UTC, 1 replies.
- basic twitter stream program not working. - posted by ji...@wipro.com on 2014/11/13 10:28:25 UTC, 2 replies.
- unable to run streaming - posted by Niko Gamulin <ni...@gmail.com> on 2014/11/13 11:06:23 UTC, 4 replies.
- runexample TwitterPopularTags showing Class Not found error - posted by ji...@wipro.com on 2014/11/13 12:32:53 UTC, 0 replies.
- Which function in spark is used to combine two RDDs by keys - posted by Blind Faith <pe...@gmail.com> on 2014/11/13 12:41:01 UTC, 2 replies.
- Re: runexample TwitterPopularTags showing Class Not found error - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/13 13:24:40 UTC, 0 replies.
- Spark GCLIB error - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/13 13:32:20 UTC, 0 replies.
- minimizing disk I/O - posted by rok <ro...@gmail.com> on 2014/11/13 14:56:00 UTC, 0 replies.
- Kafka examples - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/11/13 15:25:34 UTC, 0 replies.
- Building a hash table from a csv file using yarn-cluster, and giving it to each executor - posted by YaoPau <jo...@gmail.com> on 2014/11/13 16:34:37 UTC, 1 replies.
- Does Spark Streaming calculate during a batch? - posted by Michael Campbell <mi...@gmail.com> on 2014/11/13 16:35:15 UTC, 3 replies.
- Re: Spark/HIVE Insert Into values Error - posted by Vasu C <va...@gmail.com> on 2014/11/13 17:15:29 UTC, 0 replies.
- how to convert System.currentTimeMillis to calendar time - posted by spr <sp...@yarcdata.com> on 2014/11/13 17:47:55 UTC, 2 replies.
- suggest pyspark using 'with' for sparkcontext to be more 'pythonic' - posted by freedafeng <fr...@yahoo.com> on 2014/11/13 18:36:39 UTC, 0 replies.
- serial data import from master node without leaving spark - posted by aappddeevv <aa...@gmail.com> on 2014/11/13 19:48:42 UTC, 0 replies.
- Accessing RDD within another RDD map - posted by Simone Franzini <ca...@gmail.com> on 2014/11/13 20:28:14 UTC, 1 replies.
- Confused why I'm losing workers/executors when writing a large file to S3 - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2014/11/13 20:31:45 UTC, 2 replies.
- GraphX / PageRank with edge weights - posted by "Ommen, Jurgen" <om...@stthomas.edu> on 2014/11/13 22:28:52 UTC, 1 replies.
- GraphX: Get edges for a vertex - posted by Daniil Osipov <da...@shazam.com> on 2014/11/13 23:32:15 UTC, 1 replies.
- Spark- How can I run MapReduce only on one partition in an RDD? - posted by Tim Chou <ti...@gmail.com> on 2014/11/13 23:59:46 UTC, 2 replies.
- Spark JDBC Thirft Server over HTTP - posted by vs <vi...@gmail.com> on 2014/11/14 01:21:42 UTC, 1 replies.
- Spark Custom Receiver - posted by Jacob Abraham <ab...@gmail.com> on 2014/11/14 02:24:50 UTC, 3 replies.
- Streaming: getting total count over all windows - posted by SK <sk...@gmail.com> on 2014/11/14 02:28:11 UTC, 2 replies.
- Is there setup and cleanup function in spark? - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/11/14 03:44:02 UTC, 7 replies.
- Using a compression codec in saveAsSequenceFile in Pyspark (Python API) - posted by sahanbull <sa...@skimlinks.com> on 2014/11/14 05:28:25 UTC, 1 replies.
- Communication between Driver and Executors - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/14 05:58:13 UTC, 4 replies.
- pyspark and hdfs file name - posted by Oleg Ruchovets <or...@gmail.com> on 2014/11/14 06:39:19 UTC, 3 replies.
- toLocalIterator in Spark 1.0.0 - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/14 07:21:40 UTC, 3 replies.
- same error of SPARK-1977 while using trainImplicit in mllib 1.0.2 - posted by aaronlin <aa...@kkbox.com> on 2014/11/14 09:41:15 UTC, 2 replies.
- Spark Memory Hungry? - posted by TJ Klein <TJ...@gmail.com> on 2014/11/14 09:50:08 UTC, 1 replies.
- EmptyRDD - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/14 11:09:52 UTC, 4 replies.
- saveAsParquetFile throwing exception - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/14 11:35:12 UTC, 3 replies.
- Spark streaming fault tolerance question - posted by François Garillot <fr...@typesafe.com> on 2014/11/14 11:41:10 UTC, 0 replies.
- 1gb file processing...task doesn't launch on all the node...Unseen exception - posted by Priya Ch <le...@gmail.com> on 2014/11/14 12:47:26 UTC, 3 replies.
- Read a HDFS file from Spark using HDFS API - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/14 16:32:11 UTC, 7 replies.
- Re: Skipping Bad Records in Spark - posted by Gerard Maas <ge...@gmail.com> on 2014/11/14 16:47:50 UTC, 0 replies.
- User Authn and Authz in Spark missing ? - posted by Zeeshan Ali Shah <za...@pdc.kth.se> on 2014/11/14 16:50:28 UTC, 1 replies.
- Set worker log configuration when running "local[n]" - posted by Jim Carroll <ji...@gmail.com> on 2014/11/14 17:06:35 UTC, 2 replies.
- Declaring multiple RDDs and efficiency concerns - posted by Simone Franzini <ca...@gmail.com> on 2014/11/14 17:31:31 UTC, 2 replies.
- How do I turn off Parquet logging in a worker? - posted by Jim Carroll <ji...@gmail.com> on 2014/11/14 18:37:37 UTC, 3 replies.
- Given multiple .filter()'s, is there a way to set the order? - posted by YaoPau <jo...@gmail.com> on 2014/11/14 19:20:47 UTC, 1 replies.
- saveAsTextFile error - posted by Niko Gamulin <ni...@gmail.com> on 2014/11/14 19:39:27 UTC, 2 replies.
- Adaptive stream processing and dynamic batch sizing - posted by Josh J <jo...@gmail.com> on 2014/11/14 19:42:38 UTC, 1 replies.
- Compiling Spark master HEAD failed. - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/14 20:08:25 UTC, 0 replies.
- Cancelled Key Exceptions on Massive Join - posted by "Ganelin, Ilya" <Il...@capitalone.com> on 2014/11/14 20:16:10 UTC, 0 replies.
- How do you force a Spark Application to run in multiple tasks - posted by Steve Lewis <lo...@gmail.com> on 2014/11/14 20:18:43 UTC, 3 replies.
- Elastic allocation(spark.dynamicAllocation.enabled) results in task never being executed. - posted by Egor Pahomov <pa...@gmail.com> on 2014/11/14 20:32:51 UTC, 4 replies.
- Submitting Python Applications from Remote to Master - posted by Benjamin Zaitlen <qu...@gmail.com> on 2014/11/14 20:40:43 UTC, 3 replies.
- Kryo serialization in examples.streaming.TwitterAlgebirdCMS/HLL - posted by Debasish Das <de...@gmail.com> on 2014/11/14 21:57:58 UTC, 0 replies.
- Mulitple Spark Context - posted by Charles <ch...@cenx.com> on 2014/11/14 21:58:12 UTC, 3 replies.
- SparkSQL exception on cached parquet table - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/14 22:28:38 UTC, 10 replies.
- Sourcing data from RedShift - posted by Gary Malouf <ma...@gmail.com> on 2014/11/14 23:19:28 UTC, 4 replies.
- Client application that calls Spark and receives an MLlib model Scala Object and then predicts without Spark installed on hadoop - posted by xiaoyan yu <xi...@gmail.com> on 2014/11/14 23:54:42 UTC, 1 replies.
- filtering a SchemaRDD - posted by "Daniel, Ronald (ELS-SDG)" <R....@elsevier.com> on 2014/11/15 06:22:12 UTC, 3 replies.
- Help with Spark Streaming - posted by Bahubali Jain <ba...@gmail.com> on 2014/11/15 12:18:33 UTC, 2 replies.
- repartition combined with zipWithIndex get stuck - posted by lev <ka...@gmail.com> on 2014/11/15 12:27:41 UTC, 3 replies.
- using zip gets EOFError error - posted by chocjy <ji...@gmail.com> on 2014/11/15 19:06:49 UTC, 0 replies.
- Pagerank implementation - posted by tom85 <to...@gmail.com> on 2014/11/16 02:01:22 UTC, 2 replies.
- How to incrementally compile spark examples using mvn - posted by "Yiming (John) Zhang" <sd...@gmail.com> on 2014/11/16 02:31:54 UTC, 8 replies.
- SparkSQL exception on spark.sql.codegen - posted by Eric Zhen <zh...@gmail.com> on 2014/11/16 07:25:29 UTC, 6 replies.
- Re: Spark Streaming Application Got killed after 2 hours - posted by Prannoy <pr...@sigmoidanalytics.com> on 2014/11/16 09:27:13 UTC, 0 replies.
- How to kill/upgrade/restart driver launched in Spark standalone cluster+supervised mode? - posted by Jesper Lundgren <ko...@gmail.com> on 2014/11/16 09:46:13 UTC, 0 replies.
- Re: Cancelled Key Exceptions on Massive Join - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2014/11/16 18:56:11 UTC, 0 replies.
- Returning breeze.linalg.DenseMatrix from method - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2014/11/17 00:14:53 UTC, 2 replies.
- Interoperability between ScalaRDD, JavaRDD and PythonRDD - posted by Nam Nguyen <bi...@gmail.com> on 2014/11/17 01:26:45 UTC, 0 replies.
- Iterative changes to RDD and broadcast variables - posted by Shannon Quinn <sq...@gatech.edu> on 2014/11/17 03:32:28 UTC, 0 replies.
- RDD.aggregate versus accumulables... - posted by "Segerlind, Nathan L" <na...@intel.com> on 2014/11/17 04:06:39 UTC, 4 replies.
- Load json format dataset as RDD - posted by J <jo...@gmail.com> on 2014/11/17 04:34:39 UTC, 2 replies.
- Functions in Spark - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/17 05:13:53 UTC, 3 replies.
- spark-submit question - posted by Samarth Mailinglist <ma...@gmail.com> on 2014/11/17 05:58:53 UTC, 2 replies.
- Questions Regarding to MPI Program Migration to Spark - posted by Jun Yang <ya...@gmail.com> on 2014/11/17 07:18:19 UTC, 0 replies.
- RandomGenerator class not found exception - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2014/11/17 09:24:24 UTC, 4 replies.
- Landmarks in GraphX section of Spark API - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/17 10:17:50 UTC, 7 replies.
- Spark streaming batch overrun - posted by facboy <cn...@gmail.com> on 2014/11/17 10:44:56 UTC, 0 replies.
- HDFS read text file - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/17 11:33:44 UTC, 2 replies.
- Building Spark for Hive The requested profile "hadoop-1.2" could not be activated because it does not exist. - posted by akshayhazari <ak...@gmail.com> on 2014/11/17 11:40:19 UTC, 1 replies.
- How to measure communication between nodes in Spark Standalone Cluster? - posted by Hlib Mykhailenko <hl...@inria.fr> on 2014/11/17 11:59:24 UTC, 2 replies.
- Building Spark with hive does not work - posted by Hao Ren <in...@gmail.com> on 2014/11/17 15:02:32 UTC, 6 replies.
- How to broadcast a textFile? - posted by YaoPau <jo...@gmail.com> on 2014/11/17 18:08:36 UTC, 2 replies.
- How can I apply such an inner join in Spark Scala/Python - posted by Blind Faith <pe...@gmail.com> on 2014/11/17 18:51:11 UTC, 2 replies.
- Exception in spark sql when running a group by query - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/17 18:58:11 UTC, 2 replies.
- How do I get the executor ID from running Java code - posted by Steve Lewis <lo...@gmail.com> on 2014/11/17 18:59:24 UTC, 0 replies.
- Spark streaming on Yarn - posted by kpeng1 <kp...@gmail.com> on 2014/11/17 19:20:47 UTC, 0 replies.
- Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit - posted by akhandeshi <am...@gmail.com> on 2014/11/17 19:31:09 UTC, 0 replies.
- Re: Missing SparkSQLCLIDriver and Beeline drivers in Spark - posted by Ted Yu <yu...@gmail.com> on 2014/11/17 19:34:37 UTC, 0 replies.
- IOException: exception in uploadSinglePart - posted by Justin Mills <vo...@gmail.com> on 2014/11/17 19:37:15 UTC, 0 replies.
- independent user sessions with a multi-user spark sql thriftserver (Spark 1.1) - posted by Michael Allman <mi...@videoamp.com> on 2014/11/17 20:01:54 UTC, 1 replies.
- RDD Blocks skewing to just few executors - posted by mtimper <mi...@timper.com> on 2014/11/18 01:40:31 UTC, 0 replies.
- java.lang.ArithmeticException while create Parquet - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2014/11/18 03:05:03 UTC, 0 replies.
- Running PageRank in GraphX - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/18 07:32:52 UTC, 2 replies.
- Null pointer exception with larger datasets - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/18 07:37:09 UTC, 2 replies.
- Probability in Naive Bayes - posted by Samarth Mailinglist <ma...@gmail.com> on 2014/11/18 07:40:50 UTC, 1 replies.
- Is it safe to use Scala 2.11 for Spark build? - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/18 07:49:08 UTC, 4 replies.
- Re: Check your cluster UI to ensure that workers are registered and have sufficient memory - posted by lin_qili <li...@outlook.com> on 2014/11/18 08:37:28 UTC, 0 replies.
- Spark On Yarn Issue: Initial job has not accepted any resources - posted by LinCharlie <li...@outlook.com> on 2014/11/18 08:53:37 UTC, 1 replies.
- Logging problem in Spark when using Flume Log4jAppender - posted by QiaoanChen <ka...@gmail.com> on 2014/11/18 09:12:40 UTC, 0 replies.
- how to know the Spark worker Mechanism - posted by tangweihan <ta...@360.cn> on 2014/11/18 09:13:11 UTC, 2 replies.
- Slave Node Management in Standalone Cluster - posted by Kenichi Maehashi <we...@kenichimaehashi.com> on 2014/11/18 09:27:13 UTC, 3 replies.
- [Spark/ Spark Streaming] Spark 1.1.0 fails working with akka 2.3.6 - posted by Sourav Chandra <so...@livestream.com> on 2014/11/18 09:54:26 UTC, 0 replies.
- New Codes in GraphX - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/18 10:21:54 UTC, 7 replies.
- Spark Streaming with Kafka is failing with Error - posted by Sourav Chandra <so...@livestream.com> on 2014/11/18 11:11:47 UTC, 1 replies.
- Kestrel and Spark Stream - posted by Eduardo Alfaia <e....@unibs.it> on 2014/11/18 11:53:24 UTC, 2 replies.
- high number of int[], byte[], MutableBigInteger in spark job - posted by Rajat Verma <ve...@gmail.com> on 2014/11/18 13:11:02 UTC, 0 replies.
- How to assign consecutive numeric id to each row based on its content? - posted by shahab <sh...@gmail.com> on 2014/11/18 13:54:34 UTC, 3 replies.
- ReduceByKey but with different functions depending on key - posted by jelgh <jo...@gmail.com> on 2014/11/18 13:59:23 UTC, 3 replies.
- Getting spark job progress programmatically - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/11/18 14:43:34 UTC, 7 replies.
- Is sorting persisted after pair rdd transformations? - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/11/18 14:56:29 UTC, 5 replies.
- Pyspark Error - posted by amin mohebbi <am...@yahoo.com.INVALID> on 2014/11/18 15:10:29 UTC, 2 replies.
- sum/avg group by specified ranges - posted by tridib <tr...@live.com> on 2014/11/18 16:09:39 UTC, 0 replies.
- Nightly releases - posted by Arun Ahuja <aa...@gmail.com> on 2014/11/18 16:21:17 UTC, 5 replies.
- Is there a way to create key based on counts in Spark - posted by Blind Faith <pe...@gmail.com> on 2014/11/18 17:56:30 UTC, 4 replies.
- NotSerializableException caused by Implicit sparkContext or sparkStreamingContext, why? - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/18 17:57:01 UTC, 0 replies.
- JavaKafkaWordCount - posted by Eduardo Costa Alfaia <e....@unibs.it> on 2014/11/18 18:04:02 UTC, 0 replies.
- RDD needs several JOINs and COUNTs ... how do I optimize? - posted by YaoPau <jo...@gmail.com> on 2014/11/18 18:26:16 UTC, 0 replies.
- Spark on YARN - posted by Alan Prando <al...@scanboo.com.br> on 2014/11/18 19:03:33 UTC, 6 replies.
- Iterative transformations over RDD crashes in phantom reduce - posted by Shannon Quinn <sq...@gatech.edu> on 2014/11/18 19:58:55 UTC, 2 replies.
- Problems launching 1.2.0-SNAPSHOT cluster with Hive support on EC2 - posted by curtkohler <c....@elsevier.com> on 2014/11/18 22:01:01 UTC, 0 replies.
- GraphX twitter - posted by tom85 <to...@gmail.com> on 2014/11/18 22:29:52 UTC, 0 replies.
- Lost executors - posted by Pala M Muthaia <mc...@rocketfuelinc.com> on 2014/11/18 22:54:01 UTC, 3 replies.
- spark-shell giving me error of unread block data - posted by Anson Abraham <an...@gmail.com> on 2014/11/18 22:59:43 UTC, 10 replies.
- Cores on Master - posted by Pat Ferrel <pa...@occamsmachete.com> on 2014/11/19 00:14:39 UTC, 4 replies.
- JdbcRDD - posted by Krishna <re...@gmail.com> on 2014/11/19 00:56:14 UTC, 2 replies.
- Spark streaming: java.io.IOException: Version Mismatch (Expected: 28, Received: 18245 ) - posted by Bill Jay <bi...@gmail.com> on 2014/11/19 01:37:35 UTC, 0 replies.
- Parsing a large XML file using Spark - posted by Soumya Simanta <so...@gmail.com> on 2014/11/19 01:54:58 UTC, 5 replies.
- Converting a json struct to map - posted by Daniel Haviv <da...@gmail.com> on 2014/11/19 06:31:10 UTC, 4 replies.
- k-means clustering - posted by amin mohebbi <am...@yahoo.com.INVALID> on 2014/11/19 07:21:17 UTC, 2 replies.
- SparkSQL and Hive/Hive metastore testing - LocalHiveContext - posted by Night Wolf <ni...@gmail.com> on 2014/11/19 07:34:37 UTC, 1 replies.
- A partitionBy problem - posted by Tao Xiao <xi...@gmail.com> on 2014/11/19 08:06:53 UTC, 0 replies.
- Merging Parquet Files - posted by Daniel Haviv <da...@gmail.com> on 2014/11/19 09:41:56 UTC, 7 replies.
- Getting Parts of Iterables in Function's call method - posted by jelgh <jo...@gmail.com> on 2014/11/19 10:33:36 UTC, 0 replies.
- How to apply schema to queried data from Hive before saving it as parquet file? - posted by akshayhazari <ak...@gmail.com> on 2014/11/19 10:34:47 UTC, 4 replies.
- RE: Spark to eliminate full-table scan latency - posted by bchazalet <bc...@companywatch.net> on 2014/11/19 11:03:31 UTC, 0 replies.
- Why is ALS class serializable ? - posted by Hao Ren <in...@gmail.com> on 2014/11/19 11:39:55 UTC, 2 replies.
- Efficient way to split an input data set into different output files - posted by Tom Seddon <mr...@gmail.com> on 2014/11/19 12:39:57 UTC, 1 replies.
- Debugging spark java application - posted by Mukesh Jha <mu...@gmail.com> on 2014/11/19 15:28:11 UTC, 1 replies.
- GraphX bug re-opened - posted by Gary Malouf <ma...@gmail.com> on 2014/11/19 15:30:19 UTC, 0 replies.
- Cannot access data after a join (error: value _1 is not a member of Product with Serializable) - posted by YaoPau <jo...@gmail.com> on 2014/11/19 16:23:05 UTC, 2 replies.
- "can not found scala.reflect related methods" when running spark program - posted by Dingfei Zhang <zh...@ict.ac.cn> on 2014/11/19 16:24:57 UTC, 0 replies.
- Spark Streaming with Flume or Kafka? - posted by Guillermo Ortiz <ko...@gmail.com> on 2014/11/19 17:10:12 UTC, 6 replies.
- tableau spark sql cassandra - posted by jererc <je...@gmail.com> on 2014/11/19 17:40:03 UTC, 8 replies.
- [SQL] HiveThriftServer2 failure detection - posted by Yana Kadiyska <ya...@gmail.com> on 2014/11/19 18:19:19 UTC, 2 replies.
- Shuffle Intensive Job: sendMessageReliably failed because ack was not received within 60 sec - posted by Gary Malouf <ma...@gmail.com> on 2014/11/19 19:50:06 UTC, 1 replies.
- Re: spark streaming and the spark shell - posted by Tian Zhang <tz...@yahoo.com> on 2014/11/19 20:41:23 UTC, 0 replies.
- querying data from Cassandra through the Spark SQL Thrift JDBC server - posted by Mohammed Guller <mo...@glassbeam.com> on 2014/11/19 21:11:22 UTC, 0 replies.
- rack-topology.sh no such file or directory - posted by Arun Luthra <ar...@gmail.com> on 2014/11/19 21:13:53 UTC, 2 replies.
- NEW to spark and sparksql - posted by Sam Flint <sa...@magnetic.com> on 2014/11/19 22:02:51 UTC, 4 replies.
- Can we make EdgeRDD and VertexRDD storage level to MEMORY_AND_DISK? - posted by Harihar Nahak <hn...@wynyardgroup.com> on 2014/11/19 22:29:24 UTC, 1 replies.
- How to get list of edges between two Vertex ? - posted by Harihar Nahak <hn...@wynyardgroup.com> on 2014/11/19 22:32:57 UTC, 0 replies.
- Reading nested JSON data with Spark SQL - posted by Simone Franzini <ca...@gmail.com> on 2014/11/19 22:33:06 UTC, 2 replies.
- [SQL]Proper use of spark.sql.thriftserver.scheduler.pool - posted by Yana Kadiyska <ya...@gmail.com> on 2014/11/19 22:35:33 UTC, 0 replies.
- Re: Strategies for reading large numbers of files - posted by soojin <xa...@yahoo.com> on 2014/11/19 22:52:32 UTC, 0 replies.
- Spark Standalone Scheduling - posted by TJ Klein <TJ...@gmail.com> on 2014/11/19 23:46:57 UTC, 0 replies.
- PairRDDFunctions with Tuple2 subclasses - posted by Daniel Siegmann <da...@velos.io> on 2014/11/20 01:39:32 UTC, 2 replies.
- Joining DStream with static file - posted by YaoPau <jo...@gmail.com> on 2014/11/20 02:13:21 UTC, 1 replies.
- How to view log on yarn-client mode? - posted by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/11/20 03:01:04 UTC, 3 replies.
- insertIntoTable failure deleted pre-existing _metadata file - posted by Daniel Haviv <da...@gmail.com> on 2014/11/20 05:54:41 UTC, 0 replies.
- Spark Streaming not working in YARN mode - posted by kam lee <cl...@gmail.com> on 2014/11/20 06:06:55 UTC, 2 replies.
- Transform RDD.groupBY result to multiple RDDs - posted by "Dai, Kevin" <yu...@ebay.com> on 2014/11/20 06:19:10 UTC, 0 replies.
- Naive Baye's classification confidence - posted by jatinpreet <ja...@gmail.com> on 2014/11/20 06:42:04 UTC, 6 replies.
- Re: Transform RDD.groupBY result to multiple RDDs - posted by Sean Owen <so...@cloudera.com> on 2014/11/20 07:51:13 UTC, 0 replies.
- spark-submit and logging - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/20 09:20:03 UTC, 4 replies.
- SparkSQL exception handling - posted by Daniel Haviv <da...@gmail.com> on 2014/11/20 09:20:59 UTC, 1 replies.
- RDD Action require data from Another RDD - posted by nsareen <ns...@gmail.com> on 2014/11/20 09:43:53 UTC, 0 replies.
- Re: Optimizing text file parsing, many small files versus few big files - posted by rzykov <rz...@gmail.com> on 2014/11/20 09:51:36 UTC, 1 replies.
- Is it possible to save the streams to one single file? - posted by ji...@wipro.com on 2014/11/20 11:32:20 UTC, 2 replies.
- Error while submitting spark streaming job in YARN - posted by Tapas Swain <ta...@gmail.com> on 2014/11/20 13:02:07 UTC, 0 replies.
- RDD memory and storage level option - posted by Tsai Li Ming <ma...@ltsai.com> on 2014/11/20 13:12:06 UTC, 0 replies.
- Slow performance in spark streaming - posted by Blackeye <bl...@iit.demokritos.gr> on 2014/11/20 14:22:47 UTC, 1 replies.
- MLIB KMeans Exception - posted by Alan Prando <al...@scanboo.com.br> on 2014/11/20 14:57:36 UTC, 1 replies.
- Please help me get started on Apache Spark - posted by Saurabh Agrawal <sa...@markit.com> on 2014/11/20 15:04:01 UTC, 2 replies.
- Spark doesn't kill worker process after failing on Yarn - posted by rzykov <rz...@gmail.com> on 2014/11/20 15:10:09 UTC, 0 replies.
- Spark SQL Exception handling - posted by Daniel Haviv <da...@gmail.com> on 2014/11/20 16:34:44 UTC, 2 replies.
- Best way to store RDD data? - posted by RJ Nowling <rn...@gmail.com> on 2014/11/20 16:35:35 UTC, 1 replies.
- Windows+Pyspark+YARN - posted by Ángel Álvarez Pascua <an...@gmail.com> on 2014/11/20 16:36:42 UTC, 0 replies.
- Spark S3 Performance - posted by Nitay Joffe <ni...@actioniq.co> on 2014/11/20 17:54:54 UTC, 7 replies.
- Incremental loading data slows performance - posted by Gordon Benjamin <go...@gmail.com> on 2014/11/20 18:17:02 UTC, 1 replies.
- ClassNotFoundException in standalone mode - posted by Benoit Pasquereau <Be...@amdocs.com> on 2014/11/20 18:18:12 UTC, 3 replies.
- How can I read this avro file using spark & scala? - posted by al b <be...@googlemail.com> on 2014/11/20 19:19:39 UTC, 5 replies.
- Adding partitions to parquet data - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/20 19:33:28 UTC, 2 replies.
- [GraphX] Mining GeoData (OSM) - posted by andy petrella <an...@gmail.com> on 2014/11/20 20:33:45 UTC, 0 replies.
- How to join two RDDs with mutually exclusive keys - posted by Blind Faith <pe...@gmail.com> on 2014/11/20 21:06:32 UTC, 5 replies.
- Debug Sql execution - posted by Gordon Benjamin <go...@gmail.com> on 2014/11/20 21:11:20 UTC, 1 replies.
- Spark Streaming Metrics - posted by Gerard Maas <ge...@gmail.com> on 2014/11/20 21:25:58 UTC, 2 replies.
- Using TF-IDF from MLlib - posted by "Daniel, Ronald (ELS-SDG)" <R....@elsevier.com> on 2014/11/21 01:10:12 UTC, 3 replies.
- Re: Any issues with repartition? - posted by "Johnson, Dale" <da...@ebay.com> on 2014/11/21 01:16:41 UTC, 0 replies.
- Code works in Spark-Shell but Fails inside IntelliJ - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2014/11/21 01:35:26 UTC, 4 replies.
- logging in workers for pyspark - posted by freedafeng <fr...@yahoo.com> on 2014/11/21 01:35:26 UTC, 1 replies.
- Spark failing when loading large amount of data - posted by SK <sk...@gmail.com> on 2014/11/21 03:57:46 UTC, 1 replies.
- beeline via spark thrift doesn't retain cache - posted by Judy Nash <ju...@exchange.microsoft.com> on 2014/11/21 04:06:59 UTC, 2 replies.
- MongoDB Bulk Inserts - posted by Benny Thompson <be...@gmail.com> on 2014/11/21 04:18:26 UTC, 3 replies.
- Another accumulator question - posted by Nathan Kronenfeld <nk...@oculusinfo.com> on 2014/11/21 05:46:53 UTC, 4 replies.
- Re: Persist streams to text files - posted by ji...@wipro.com on 2014/11/21 06:53:37 UTC, 5 replies.
- Determine number of running executors - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/11/21 08:35:52 UTC, 3 replies.
- Spark serialization issues with third-party libraries - posted by jatinpreet <ja...@gmail.com> on 2014/11/21 08:39:37 UTC, 4 replies.
- processing files - posted by Philippe de Rochambeau <ph...@free.fr> on 2014/11/21 08:46:32 UTC, 1 replies.
- How to get applicationId for yarn mode(both yarn-client and yarn-cluster mode) - posted by Earthson <Ea...@gmail.com> on 2014/11/21 10:12:12 UTC, 1 replies.
- spark code style - posted by Kevin Jung <it...@samsung.com> on 2014/11/21 10:20:39 UTC, 1 replies.
- Is there a way to turn on spark eventLog on the worker node? - posted by Xuelin Cao <xu...@yahoo.com> on 2014/11/21 10:30:09 UTC, 3 replies.
- short-circuit local reads cannot be used - posted by Daniel Haviv <da...@gmail.com> on 2014/11/21 10:47:24 UTC, 0 replies.
- How to deal with BigInt in my case class for RDD => SchemaRDD convertion - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/21 11:11:27 UTC, 2 replies.
- Spark: Simple local test failed depending on memory settings - posted by rzykov <rz...@gmail.com> on 2014/11/21 11:25:48 UTC, 0 replies.
- EC2 cluster with SSD ebs - posted by Hao Ren <in...@gmail.com> on 2014/11/21 11:30:07 UTC, 1 replies.
- Setup Remote HDFS for Spark - posted by EH <ea...@gmail.com> on 2014/11/21 15:28:15 UTC, 4 replies.
- Execute Spark programs from local machine on Yarn-hadoop cluster - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/21 15:39:04 UTC, 2 replies.
- Re: RDD data checkpoint cleaning - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/11/21 16:17:10 UTC, 1 replies.
- Error: Unrecognized option '--conf' (trying to set auto.offset.reset) - posted by YaoPau <jo...@gmail.com> on 2014/11/21 16:58:45 UTC, 2 replies.
- Lots of small input files - posted by Pat Ferrel <pa...@occamsmachete.com> on 2014/11/21 17:19:27 UTC, 2 replies.
- Re: driver memory - posted by Gen <ge...@gmail.com> on 2014/11/21 17:52:58 UTC, 0 replies.
- JVM Memory Woes - posted by pthai <th...@gmail.com> on 2014/11/21 18:44:42 UTC, 1 replies.
- Many retries for Python job - posted by Brett Meyer <Br...@crowdstrike.com> on 2014/11/21 18:47:36 UTC, 2 replies.
- SparkSQL - can we add new column(s) to parquet files - posted by Sadhan Sood <sa...@gmail.com> on 2014/11/21 19:03:53 UTC, 1 replies.
- SparkSQL Timestamp query failure - posted by whitebread <al...@me.com> on 2014/11/21 19:39:01 UTC, 6 replies.
- Extracting values from a Collecion - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2014/11/21 19:41:38 UTC, 4 replies.
- MLLib: LinearRegressionWithSGD performance - posted by Sameer Tilak <ss...@live.com> on 2014/11/21 20:18:32 UTC, 3 replies.
- Spark SQL with Apache Phoenix lower and upper Bound - posted by Alaa Ali <co...@gmail.com> on 2014/11/21 22:14:11 UTC, 5 replies.
- Running Spark application from Tomcat - posted by Andreas Koch <an...@andreas-koch.dk> on 2014/11/21 23:09:49 UTC, 0 replies.
- Persist kafka streams to text file - posted by Joanne Contact <jo...@gmail.com> on 2014/11/21 23:32:35 UTC, 0 replies.
- Persist kafka streams to text file, tachyon error? - posted by Joanne Contact <jo...@gmail.com> on 2014/11/21 23:48:24 UTC, 1 replies.
- Book: Data Analysis with SparkR - posted by Emaasit <da...@gmail.com> on 2014/11/22 00:48:16 UTC, 1 replies.
- Missing parents for stage (Spark Streaming) - posted by YaoPau <jo...@gmail.com> on 2014/11/22 00:57:23 UTC, 1 replies.
- allocating different memory to different executor for same application - posted by tridib <tr...@live.com> on 2014/11/22 02:34:11 UTC, 0 replies.
- spark-sql broken - posted by tridib <tr...@live.com> on 2014/11/22 03:39:17 UTC, 1 replies.
- latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava - posted by Judy Nash <ju...@exchange.microsoft.com> on 2014/11/22 04:05:35 UTC, 11 replies.
- Spark streaming job failing after some time. - posted by pankaj channe <pa...@gmail.com> on 2014/11/22 04:09:15 UTC, 4 replies.
- Re: querying data from Cassandra through the Spark SQL Thrift JDBC server - posted by Cheng Lian <li...@gmail.com> on 2014/11/22 12:53:33 UTC, 1 replies.
- Spark or MR, Scala or Java? - posted by Guillermo Ortiz <ko...@gmail.com> on 2014/11/22 16:34:04 UTC, 11 replies.
- Getting exception on JavaSchemaRDD; org.apache.spark.SparkException: Task not serializable - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/22 17:32:03 UTC, 4 replies.
- RDD with object shared across elements within a partition. Magic number 200? - posted by insperatum <in...@gmail.com> on 2014/11/22 17:56:33 UTC, 1 replies.
- Error when Spark streaming consumes from Kafka - posted by Bill Jay <bi...@gmail.com> on 2014/11/22 21:43:06 UTC, 2 replies.
- SparkBigData.com: The Apache Spark Knowledge Base - posted by Slim Baltagi <sb...@gmail.com> on 2014/11/22 21:51:36 UTC, 0 replies.
- Spark SQL Programming Guide - registerTempTable Error - posted by riginos <sa...@gmail.com> on 2014/11/23 19:59:18 UTC, 5 replies.
- Spark Streaming with Python - posted by "Venkat, Ankam" <An...@centurylink.com> on 2014/11/23 20:04:43 UTC, 2 replies.
- Python Logistic Regression error - posted by "Venkat, Ankam" <An...@centurylink.com> on 2014/11/23 20:38:04 UTC, 1 replies.
- Converting a column to a map - posted by Daniel Haviv <da...@gmail.com> on 2014/11/23 21:01:43 UTC, 1 replies.
- Creating a front-end for output from Spark/PySpark - posted by Alaa Ali <co...@gmail.com> on 2014/11/23 21:37:06 UTC, 2 replies.
- How to insert complex types like map> in spark sql - posted by critikaled <is...@gmail.com> on 2014/11/24 00:00:20 UTC, 7 replies.
- How to keep a local variable in each cluster? - posted by zh8788 <78...@qq.com> on 2014/11/24 02:41:37 UTC, 2 replies.
- wholeTextFiles on 20 nodes - posted by Simon Hafner <re...@gmail.com> on 2014/11/24 03:10:46 UTC, 0 replies.
- Question about resource sharing in Spark Standalone - posted by Patrick Liu <li...@163.com> on 2014/11/24 03:49:38 UTC, 0 replies.
- 2 spark streaming questions - posted by tian zhang <tz...@yahoo.com.INVALID> on 2014/11/24 06:31:28 UTC, 0 replies.
- Issues about running on client in standalone mode - posted by LinQili <li...@outlook.com> on 2014/11/24 10:32:36 UTC, 1 replies.
- Store kmeans model - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/11/24 11:21:41 UTC, 1 replies.
- Submit Spark driver on Yarn Cluster in client mode - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/24 11:31:02 UTC, 3 replies.
- issue while running the code in standalone mode: "Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory" - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/24 11:43:00 UTC, 3 replies.
- Use case question - posted by Gordon Benjamin <go...@gmail.com> on 2014/11/24 12:04:40 UTC, 4 replies.
- Writing collection to file error - posted by Saurabh Agrawal <sa...@markit.com> on 2014/11/24 12:35:38 UTC, 3 replies.
- spark broadcast error - posted by Ke Wang <jk...@gmail.com> on 2014/11/24 13:44:40 UTC, 0 replies.
- ExternalAppendOnlyMap: Thread spilling in-memory map of to disk many times slowly - posted by Romi Kuntsman <ro...@totango.com> on 2014/11/24 15:20:36 UTC, 1 replies.
- Spark Cassandra Guava version issues - posted by Ashic Mahtab <as...@live.com> on 2014/11/24 15:21:08 UTC, 2 replies.
- Spark SQL (1.0) - posted by david <da...@free.fr> on 2014/11/24 16:13:45 UTC, 0 replies.
- Spark and Stanford CoreNLP - posted by tvas <th...@gmail.com> on 2014/11/24 16:46:04 UTC, 9 replies.
- Connected Components running for a long time and failing eventually - posted by nitinkak001 <ni...@gmail.com> on 2014/11/24 17:19:07 UTC, 0 replies.
- advantages of SparkSQL? - posted by mrm <ma...@skimlinks.com> on 2014/11/24 17:20:47 UTC, 4 replies.
- Mllib native netlib-java/OpenBLAS - posted by agg212 <al...@brown.edu> on 2014/11/24 17:51:33 UTC, 6 replies.
- How does Spark SQL traverse the physical tree? - posted by Tim Chou <ti...@gmail.com> on 2014/11/24 18:52:21 UTC, 1 replies.
- Spark error in execution - posted by Blackeye <bl...@iit.demokritos.gr> on 2014/11/24 19:15:14 UTC, 0 replies.
- Using Spark Context as an attribute of a class cannot be used - posted by aecc <al...@gmail.com> on 2014/11/24 19:15:26 UTC, 8 replies.
- Python Scientific Libraries in Spark - posted by Rohit Pujari <rp...@hortonworks.com> on 2014/11/24 19:56:05 UTC, 1 replies.
- Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD - posted by "Bui, Tri" <Tr...@VerizonWireless.com.INVALID> on 2014/11/24 20:31:19 UTC, 6 replies.
- Building Yarn mode with sbt - posted by Akshat Aranya <aa...@gmail.com> on 2014/11/24 20:42:03 UTC, 0 replies.
- Unable to use Kryo - posted by Daniel Haviv <da...@gmail.com> on 2014/11/24 22:12:57 UTC, 1 replies.
- Ideas on how to use Spark for anomaly detection on a stream of data - posted by Natu Lauchande <nl...@gmail.com> on 2014/11/24 23:17:08 UTC, 2 replies.
- Spark SQL - Any time line to move beyond Alpha version ? - posted by Manoj Samel <ma...@gmail.com> on 2014/11/24 23:53:51 UTC, 1 replies.
- Is spark streaming +MlLib for online learning? - posted by Joanne Contact <jo...@gmail.com> on 2014/11/25 01:40:46 UTC, 3 replies.
- Negative Accumulators - posted by Peter Thai <th...@gmail.com> on 2014/11/25 03:31:26 UTC, 2 replies.
- Spark performance optimization examples - posted by SK <sk...@gmail.com> on 2014/11/25 03:32:17 UTC, 1 replies.
- Spark saveAsText file size - posted by Alan Prando <al...@scanboo.com.br> on 2014/11/25 03:38:14 UTC, 1 replies.
- Is Spark? or GraphX runs fast? a performance comparison on Page Rank - posted by Harihar Nahak <hn...@wynyardgroup.com> on 2014/11/25 04:02:08 UTC, 3 replies.
- Control number of parquet generated from JavaSchemaRDD - posted by tridib <tr...@live.com> on 2014/11/25 05:24:25 UTC, 8 replies.
- How to access application name in the spark framework code. - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/25 05:40:16 UTC, 2 replies.
- Edge List File in GraphX - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/25 07:22:16 UTC, 0 replies.
- Spark SQL Join returns less rows that expected - posted by david <da...@free.fr> on 2014/11/25 09:08:51 UTC, 2 replies.
- Understanding stages in WebUI - posted by Tsai Li Ming <ma...@ltsai.com> on 2014/11/25 09:22:40 UTC, 0 replies.
- K-means clustering - posted by amin mohebbi <am...@yahoo.com.INVALID> on 2014/11/25 11:39:58 UTC, 1 replies.
- restructure key-value pair with lambda in java - posted by robertrichter <ro...@gmail.com> on 2014/11/25 12:54:11 UTC, 0 replies.
- Lifecycle of RDD in spark-streaming - posted by Mukesh Jha <me...@gmail.com> on 2014/11/25 13:02:20 UTC, 11 replies.
- MlLib Colaborative filtering factors - posted by Saurabh Agrawal <sa...@markit.com> on 2014/11/25 13:28:44 UTC, 1 replies.
- Spark cluster with Java 8 using ./spark-ec2 - posted by Jon Chase <jo...@gmail.com> on 2014/11/25 15:06:02 UTC, 0 replies.
- ALS train error - posted by Saurabh Agrawal <sa...@markit.com> on 2014/11/25 15:14:39 UTC, 0 replies.
- Remapping columns from a schemaRDD - posted by Daniel Haviv <da...@gmail.com> on 2014/11/25 16:02:01 UTC, 4 replies.
- Spark yarn cluster Application Master not running yarn container - posted by firemonk9 <dh...@gmail.com> on 2014/11/25 17:11:21 UTC, 0 replies.
- why MatrixFactorizationModel private? - posted by jamborta <ja...@gmail.com> on 2014/11/25 17:26:12 UTC, 4 replies.
- Spark shell running on mesos - posted by José Guilherme Vanz <gu...@gmail.com> on 2014/11/25 18:21:25 UTC, 0 replies.
- RDD Cache Cleanup - posted by sranga <sr...@gmail.com> on 2014/11/25 18:54:58 UTC, 1 replies.
- Why is this operation so expensive - posted by Steve Lewis <lo...@gmail.com> on 2014/11/25 19:06:50 UTC, 2 replies.
- Spark sql UDF for array aggergation - posted by "Barua, Seemanto" <se...@jpmchase.com.INVALID> on 2014/11/25 19:10:44 UTC, 1 replies.
- RDD C - posted by sranga <sr...@gmail.com> on 2014/11/25 19:34:59 UTC, 0 replies.
- using MultipleOutputFormat to ensure one output file per key - posted by Arpan Ghosh <ar...@automatic.com> on 2014/11/25 20:30:29 UTC, 1 replies.
- RE: Spark SQL parser bug? - posted by Leon <pa...@gmail.com> on 2014/11/25 20:35:45 UTC, 1 replies.
- How to execute a custom python library on spark - posted by Chengi Liu <ch...@gmail.com> on 2014/11/25 21:09:27 UTC, 1 replies.
- Kryo NPE with Array - posted by Simone Franzini <ca...@gmail.com> on 2014/11/25 21:38:27 UTC, 1 replies.
- Data Source for Spark SQL - posted by ken <ke...@verizon.com> on 2014/11/25 23:42:57 UTC, 0 replies.
- Submitting job from local to EC2 cluster - posted by Yingkai Hu <yi...@gmail.com> on 2014/11/26 01:19:07 UTC, 1 replies.
- Classpath issue: Custom authentication with sparkSQL/Spark 1.2 - posted by "arin.g" <ar...@yahoo.com> on 2014/11/26 01:49:26 UTC, 0 replies.
- IDF model error - posted by Shivani Rao <ra...@gmail.com> on 2014/11/26 03:09:42 UTC, 3 replies.
- Issue with Spark latest 1.2.0 build - ClassCastException from [B to SerializableWritable - posted by lokeshkumar <lo...@dataken.net> on 2014/11/26 03:42:23 UTC, 4 replies.
- Spark on YARN - master role - posted by Praveen Sripati <pr...@gmail.com> on 2014/11/26 03:45:46 UTC, 0 replies.
- do not assemble the spark example jar - posted by lihu <li...@gmail.com> on 2014/11/26 04:50:52 UTC, 4 replies.
- Spark 1.1.0 and HBase: Snappy UnsatisfiedLinkError - posted by Pietro Gentile <pi...@gmail.com> on 2014/11/26 07:07:30 UTC, 0 replies.
- Accessing posterior probability of Naive Baye's prediction - posted by jatinpreet <ja...@gmail.com> on 2014/11/26 07:11:06 UTC, 5 replies.
- Re: How to do broadcast join in SparkSQL - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/26 07:13:55 UTC, 1 replies.
- Spark setup on local windows machine - posted by Sunita Arvind <su...@gmail.com> on 2014/11/26 07:54:24 UTC, 2 replies.
- configure to run multiple tasks on a core - posted by yotto <yo...@autodesk.com> on 2014/11/26 07:57:14 UTC, 4 replies.
- S3NativeFileSystem inefficient implementation when calling sc.textFile - posted by Tomer Benyamini <to...@gmail.com> on 2014/11/26 09:06:29 UTC, 7 replies.
- Spark SQL performance and data size constraints - posted by SK <sk...@gmail.com> on 2014/11/26 09:16:46 UTC, 1 replies.
- Auto BroadcastJoin optimization failed in latest Spark - posted by Jianshi Huang <ji...@gmail.com> on 2014/11/26 09:36:22 UTC, 3 replies.
- Starting the thrift server - posted by Daniel Haviv <da...@gmail.com> on 2014/11/26 10:48:08 UTC, 0 replies.
- Spark Job submit - posted by Naveen Kumar Pokala <np...@spcapitaliq.com> on 2014/11/26 11:50:00 UTC, 4 replies.
- RMSE in MovieLensALS increases or stays stable as iterations increase. - posted by Kostas Kloudas <kk...@gmail.com> on 2014/11/26 12:57:48 UTC, 2 replies.
- Re: RMSE in MovieLensALS increases or stays stable as iterations increase. - posted by Nick Pentreath <ni...@gmail.com> on 2014/11/26 13:04:06 UTC, 6 replies.
- Having problem with Spark streaming with Kinesis - posted by "A.K.M. Ashrafuzzaman" <as...@gmail.com> on 2014/11/26 13:23:05 UTC, 4 replies.
- Number of executors and tasks - posted by Praveen Sripati <pr...@gmail.com> on 2014/11/26 13:24:02 UTC, 2 replies.
- Re: Running spark in standlone alone mode, saveAsTextFile() runs for forever - posted by lalit1303 <la...@sigmoidanalytics.com> on 2014/11/26 13:58:09 UTC, 0 replies.
- SchemaRDD compute function - posted by Jörg Schad <jo...@gmail.com> on 2014/11/26 14:05:21 UTC, 1 replies.
- how to force graphx to execute transfomtation - posted by Hlib Mykhailenko <hl...@inria.fr> on 2014/11/26 14:25:10 UTC, 2 replies.
- force spark to use all available memory on each node - posted by Hlib Mykhailenko <hl...@inria.fr> on 2014/11/26 14:55:09 UTC, 0 replies.
- Spark Mesos integration bug? - posted by Padmanabhan, "Mahesh (contractor)" <ma...@twc-contractor.com> on 2014/11/26 16:36:35 UTC, 0 replies.
- SparkContext.textfile() cannot load file using UNC path on windows - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2014/11/26 16:48:08 UTC, 0 replies.
- Jetty as spark straming input - posted by Guy Doulberg <Gu...@perion.com> on 2014/11/26 17:06:40 UTC, 2 replies.
- Unable to generate assembly jar which includes jdbc-thrift server - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/26 17:53:35 UTC, 9 replies.
- Executor failover - posted by Akshat Aranya <aa...@gmail.com> on 2014/11/26 18:13:10 UTC, 0 replies.
- This is just a test - posted by NingjunWang <ni...@lexisnexis.com> on 2014/11/26 18:36:06 UTC, 0 replies.
- RDD saveAsObjectFile write to local file and HDFS - posted by firemonk9 <dh...@gmail.com> on 2014/11/26 19:15:19 UTC, 2 replies.
- Map inside map - posted by Franco Barrientos <fr...@exalitica.com> on 2014/11/26 19:36:12 UTC, 1 replies.
- How can a function access Executor ID, Function ID and other parameters known to the Spark Environment - posted by Steve Lewis <lo...@gmail.com> on 2014/11/26 19:56:08 UTC, 0 replies.
- streaming tasks unevenly distributed among the executors? - posted by yuz1988 <yu...@gmail.com> on 2014/11/27 00:06:39 UTC, 0 replies.
- SchemaRDD.saveAsTable() when schema contains arrays and was loaded from a JSON file using schema auto-detection - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2014/11/27 02:23:58 UTC, 3 replies.
- can't get smallint field from hive on spark - posted by 诺铁 <no...@gmail.com> on 2014/11/27 04:10:44 UTC, 4 replies.
- RDDs join problem: incorrect result - posted by liuboya <24...@qq.com> on 2014/11/27 05:07:38 UTC, 0 replies.
- Undirected Graphs in GraphX-Pregel - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/27 06:21:17 UTC, 1 replies.
- Opening Spark on IntelliJ IDEA - posted by Taeyun Kim <ta...@innowireless.com> on 2014/11/27 07:00:16 UTC, 4 replies.
- (Unknown) - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/27 07:06:57 UTC, 0 replies.
- How the sequence of blockManagerId's are constructed in spark/*/storage/blockManagerMasterActor.getPeers()? - posted by rapelly kartheek <ka...@gmail.com> on 2014/11/27 07:17:15 UTC, 0 replies.
- updateStateByKey - posted by Sunil Yarram <yv...@gmail.com> on 2014/11/27 08:12:22 UTC, 0 replies.
- GraphX:java.lang.NoSuchMethodError:org.apache.spark.graphx.Graph$.apply - posted by liuboya <24...@qq.com> on 2014/11/27 08:41:46 UTC, 1 replies.
- RDD decouple store implementations - posted by Guy Doulberg <Gu...@perion.com> on 2014/11/27 09:02:01 UTC, 0 replies.
- Spark 1.1.1 released but not available on maven repositories - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/11/27 11:42:29 UTC, 2 replies.
- Exception while starting thrift server - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/11/27 12:52:49 UTC, 0 replies.
- [graphx] failed to submit an application with java.lang.ClassNotFoundException - posted by Yifan LI <ia...@gmail.com> on 2014/11/27 13:03:34 UTC, 0 replies.
- Best way to do a lookup in Spark - posted by Ashic Mahtab <as...@live.com> on 2014/11/27 14:36:32 UTC, 0 replies.
- reduceByKey and empty output files - posted by Praveen Sripati <pr...@gmail.com> on 2014/11/27 14:46:18 UTC, 0 replies.
- Mesos killing Spark Driver - posted by Gerard Maas <ge...@gmail.com> on 2014/11/27 15:38:28 UTC, 1 replies.
- Percentile - posted by Franco Barrientos <fr...@exalitica.com> on 2014/11/27 16:28:27 UTC, 2 replies.
- Using Breeze in the Scala Shell - posted by Dean Jones <de...@gmail.com> on 2014/11/27 18:15:59 UTC, 2 replies.
- ALS failure with size > Integer.MAX_VALUE - posted by Bharath Ravi Kumar <re...@gmail.com> on 2014/11/27 19:30:21 UTC, 3 replies.
- SVD Plus Plus in GraphX - posted by Deep Pradhan <pr...@gmail.com> on 2014/11/28 05:46:53 UTC, 0 replies.
- Creating a SchemaRDD from an existing API - posted by Niranda Perera <ni...@wso2.com> on 2014/11/28 07:31:13 UTC, 1 replies.
- Re: read both local path and HDFS path - posted by Prannoy <pr...@sigmoidanalytics.com> on 2014/11/28 08:21:59 UTC, 0 replies.
- Unable to compile spark 1.1.0 on windows 8.1 - posted by Ishwardeep Singh <is...@impetus.co.in> on 2014/11/28 08:30:49 UTC, 0 replies.
- Spark SQL 1.0.0 - RDD from snappy compress avro file - posted by cjdc <cr...@cern.ch> on 2014/11/28 09:41:57 UTC, 3 replies.
- Re: How to use FlumeInputDStream in spark cluster? - posted by Prannoy <pr...@sigmoidanalytics.com> on 2014/11/28 09:56:08 UTC, 0 replies.
- Deadlock between spark logging and wildfly logging - posted by Charles <ch...@cenx.com> on 2014/11/28 17:01:47 UTC, 3 replies.
- optimize multiple filter operations - posted by mrm <ma...@skimlinks.com> on 2014/11/28 17:21:04 UTC, 2 replies.
- Understanding and optimizing spark disk usage during a job. - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/11/28 18:13:40 UTC, 1 replies.
- Re: Calling spark from a java web application. - posted by adrian <ad...@gmail.com> on 2014/11/28 20:03:17 UTC, 0 replies.
- Using Primitive collections in Spark - posted by sameerk <sa...@gmail.com> on 2014/11/29 09:40:18 UTC, 0 replies.
- Generating a DStream by existing textfiles - posted by yu <yu...@iastate.edu> on 2014/11/30 00:16:15 UTC, 1 replies.
- Appending with saveAsTextFile? - posted by YaoPau <jo...@gmail.com> on 2014/11/30 00:24:33 UTC, 0 replies.
- java.io.InvalidClassException: org.apache.spark.api.java.JavaUtils$SerializableMapWrapper; no valid constructor - posted by lokeshkumar <lo...@dataken.net> on 2014/11/30 03:01:16 UTC, 1 replies.
- Multiple SparkContexts in same Driver JVM - posted by lokeshkumar <lo...@dataken.net> on 2014/11/30 05:37:18 UTC, 1 replies.
- Loading JSON Dataset fails with com.fasterxml.jackson.databind.JsonMappingException - posted by Peter Vandenabeele <pe...@vandenabeele.com> on 2014/11/30 13:10:06 UTC, 1 replies.
- kafka pipeline exactly once semantics - posted by Josh J <jo...@gmail.com> on 2014/11/30 14:17:26 UTC, 0 replies.
- Re: Publishing a transformed DStream to Kafka - posted by Josh J <jo...@gmail.com> on 2014/11/30 14:26:01 UTC, 1 replies.
- Setting network variables in spark-shell - posted by Brian Dolan <bu...@yahoo.com.INVALID> on 2014/11/30 17:25:48 UTC, 2 replies.
- Is there any Spark implementation for Item-based Collaborative Filtering? - posted by shahab <sh...@gmail.com> on 2014/11/30 18:36:55 UTC, 3 replies.