- Re: Mapping Hadoop Reduce to Spark - posted by Matei Zaharia <ma...@gmail.com> on 2014/09/01 00:48:15 UTC, 5 replies.
- numpy digitize - posted by filipus <fl...@gmail.com> on 2014/09/01 02:06:56 UTC, 0 replies.
- Re: What does "appMasterRpcPort: -1" indicate ? - posted by Tao Xiao <xi...@gmail.com> on 2014/09/01 02:47:47 UTC, 0 replies.
- Spark+OpenCV: Real Time Image Processing - posted by Varuzhan <la...@gmail.com> on 2014/09/01 02:51:39 UTC, 0 replies.
- RE: The concurrent model of spark job/stage/task - posted by "Liu, Raymond" <ra...@intel.com> on 2014/09/01 03:00:08 UTC, 0 replies.
- RE: how to filter value in spark - posted by "Liu, Raymond" <ra...@intel.com> on 2014/09/01 03:23:13 UTC, 1 replies.
- HELP! EXPORT DATA FROM HIVE TO SQL SERVER - posted by churly lin <ch...@gmail.com> on 2014/09/01 04:41:40 UTC, 1 replies.
- [Stream] Checkpointing | chmod: cannot access `/cygdrive/d/tmp/spark/f8e594bf-d940-41cb-ab0e-0fd3710696cb/rdd-57/.part-00001-attempt-215': No such file or directory - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/01 08:18:54 UTC, 1 replies.
- RDD.pipe error on context cleaning - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/09/01 10:14:29 UTC, 0 replies.
- operations on replicated RDD - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/01 10:42:24 UTC, 0 replies.
- Spark driver application can not connect to Spark-Master - posted by moon soo Lee <mo...@nflabs.com> on 2014/09/01 10:46:45 UTC, 2 replies.
- Can value in spark-defaults.conf support system variables? - posted by Zhanfeng Huo <hu...@gmail.com> on 2014/09/01 11:53:57 UTC, 2 replies.
- Has anybody faced SPARK-2604 issue regarding Application hang state - posted by twinkle sachdeva <tw...@gmail.com> on 2014/09/01 13:27:52 UTC, 0 replies.
- Value of SHUFFLE_PARTITIONS - posted by Chirag Aggarwal <Ch...@guavus.com> on 2014/09/01 14:02:18 UTC, 0 replies.
- [Streaming] Triggering an action in absence of data - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/01 14:25:00 UTC, 1 replies.
- Re: Problem Accessing Hive Table from hiveContext - posted by Yin Huai <hu...@gmail.com> on 2014/09/01 15:36:13 UTC, 0 replies.
- Re: transforming a Map object to RDD - posted by Matthew Farrellee <ma...@redhat.com> on 2014/09/01 16:38:04 UTC, 0 replies.
- Spark and Shark - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/01 17:21:21 UTC, 2 replies.
- Re: Time series forecasting - posted by filipus <fl...@gmail.com> on 2014/09/01 22:03:20 UTC, 0 replies.
- Spark 1.0.2 Can GroupByTest example be run in Eclipse without change - posted by Shing Hing Man <ma...@yahoo.com.INVALID> on 2014/09/01 23:08:54 UTC, 1 replies.
- zip equal-length but unequally-partition - posted by Kevin Jung <it...@samsung.com> on 2014/09/02 05:39:18 UTC, 2 replies.
- Spark-shell return results when the job is executing? - posted by Hao Wang <wh...@gmail.com> on 2014/09/02 07:16:23 UTC, 2 replies.
- Unsupported language features in query - posted by centerqi hu <ce...@gmail.com> on 2014/09/02 09:35:14 UTC, 4 replies.
- New features (Discretization) for v1.x in xiangrui.pdf - posted by filipus <fl...@gmail.com> on 2014/09/02 10:50:19 UTC, 4 replies.
- pyspark yarn got exception - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/02 11:42:01 UTC, 20 replies.
- save schemardd to hive - posted by centerqi hu <ce...@gmail.com> on 2014/09/02 11:50:03 UTC, 5 replies.
- Re: [Streaming] Cannot get executors to stay alive - posted by Yana <ya...@gmail.com> on 2014/09/02 15:57:39 UTC, 0 replies.
- Spark Java Configuration. - posted by pcsenthil <pc...@gmail.com> on 2014/09/02 16:02:48 UTC, 1 replies.
- Re: Where to save intermediate results? - posted by Daniel Siegmann <da...@velos.io> on 2014/09/02 16:04:33 UTC, 0 replies.
- Using Spark's ActionSystem for performing analytics using Akka - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/02 16:13:01 UTC, 0 replies.
- Spark on YARN question - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/09/02 17:06:19 UTC, 6 replies.
- Spark on Mesos: Pyspark python libraries - posted by Daniel Rodriguez <df...@gmail.com> on 2014/09/02 18:31:41 UTC, 1 replies.
- Serialized 3rd party libs - posted by Matt Narrell <ma...@gmail.com> on 2014/09/02 18:45:28 UTC, 2 replies.
- Re: saveAsTextFile makes no progress without caching RDD - posted by jerryye <je...@gmail.com> on 2014/09/02 18:48:45 UTC, 0 replies.
- Re: Possible to make one executor be able to work on multiple tasks simultaneously? - posted by Sean Owen <so...@cloudera.com> on 2014/09/02 19:05:30 UTC, 1 replies.
- Regarding function unpersist on rdd - posted by Zijing Guo <al...@yahoo.com.INVALID> on 2014/09/02 19:10:22 UTC, 0 replies.
- Publishing a transformed DStream to Kafka - posted by Massimiliano Tomassi <ma...@gmail.com> on 2014/09/02 19:12:50 UTC, 1 replies.
- pyspark and cassandra - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/02 20:10:12 UTC, 3 replies.
- mllib performance on cluster - posted by SK <sk...@gmail.com> on 2014/09/02 20:24:58 UTC, 7 replies.
- Re: Spark Streaming : Could not compute split, block not found - posted by Tim Smith <se...@gmail.com> on 2014/09/02 20:47:09 UTC, 0 replies.
- MLLib decision tree: Weights - posted by Sameer Tilak <ss...@live.com> on 2014/09/02 22:05:53 UTC, 2 replies.
- Re: spark-ec2 [Errno 110] Connection time out - posted by Daniil Osipov <da...@shazam.com> on 2014/09/02 22:08:22 UTC, 0 replies.
- flattening a list in spark sql - posted by gtinside <gt...@gmail.com> on 2014/09/02 22:26:15 UTC, 5 replies.
- Re: Spark Streaming with Kafka, building project with 'sbt assembly' is extremely slow - posted by Daniil Osipov <da...@shazam.com> on 2014/09/02 23:13:18 UTC, 2 replies.
- Creating an RDD in another RDD causes deadlock - posted by cjwang <cj...@cjwang.us> on 2014/09/02 23:14:48 UTC, 2 replies.
- I am looking for a Java sample of a Partitioner - posted by Steve Lewis <lo...@gmail.com> on 2014/09/02 23:27:52 UTC, 1 replies.
- Spark Streaming - how to implement multiple calculation using the same data set - posted by salemi <al...@udo.edu> on 2014/09/02 23:54:49 UTC, 2 replies.
- RE: [Streaming] Akka-based receiver with messages defined in uploaded jar - posted by Anton Brazhnyk <an...@genesys.com> on 2014/09/03 00:54:13 UTC, 1 replies.
- Re: [PySpark] large # of partitions causes OOM - posted by Matthew Farrellee <ma...@redhat.com> on 2014/09/03 02:12:46 UTC, 0 replies.
- What is the appropriate privileges needed for writting files into checkpoint directory? - posted by Tao Xiao <xi...@gmail.com> on 2014/09/03 04:28:23 UTC, 1 replies.
- Number of elements in ArrayBuffer - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/03 06:43:05 UTC, 3 replies.
- Memcached error when using during map - posted by gavin zhang <ga...@gmail.com> on 2014/09/03 10:13:55 UTC, 0 replies.
- .sparkrc for Spark shell? - posted by Jianshi Huang <ji...@gmail.com> on 2014/09/03 10:47:24 UTC, 3 replies.
- RDDs - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/03 11:02:56 UTC, 7 replies.
- SparkSQL TPC-H query 3 joining multiple tables - posted by Samay <sm...@gmail.com> on 2014/09/03 11:12:57 UTC, 1 replies.
- Exchanging data between pyspark and scala - posted by Dominik Hübner <co...@dhuebner.com> on 2014/09/03 11:42:49 UTC, 0 replies.
- Re: Invalid Class Exception - posted by niranda <ni...@wso2.com> on 2014/09/03 12:00:15 UTC, 0 replies.
- Support R in Spark - posted by oppokui <op...@gmail.com> on 2014/09/03 12:19:16 UTC, 9 replies.
- How to list all registered tables in a sql context? - posted by Jianshi Huang <ji...@gmail.com> on 2014/09/03 13:03:56 UTC, 3 replies.
- parsing json in spark streaming - posted by godraude <ed...@gmail.com> on 2014/09/03 13:52:27 UTC, 0 replies.
- How to clear broadcast variable from driver memory? - posted by Kevin Jung <it...@samsung.com> on 2014/09/03 13:56:43 UTC, 1 replies.
- pyspark on yarn hdp hortonworks - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/03 14:03:07 UTC, 2 replies.
- Message Passing among workers - posted by laxmanvemula <la...@gmail.com> on 2014/09/03 16:28:07 UTC, 1 replies.
- Re: [GraphX] how to set memory configurations to avoid OutOfMemoryError "GC overhead limit exceeded" - posted by Yifan LI <ia...@gmail.com> on 2014/09/03 17:58:09 UTC, 3 replies.
- Accessing neighboring elements in an RDD - posted by "Daniel, Ronald (ELS-SDG)" <R....@elsevier.com> on 2014/09/03 19:33:24 UTC, 5 replies.
- Re: Low Level Kafka Consumer for Spark - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2014/09/03 19:38:55 UTC, 8 replies.
- How can I start history-server with kerberos HDFS ? - posted by Zhanfeng Huo <hu...@gmail.com> on 2014/09/03 20:00:13 UTC, 3 replies.
- spark history server trying to hit port 8021 - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/09/03 20:56:29 UTC, 2 replies.
- Web UI - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/03 21:12:54 UTC, 9 replies.
- Spark Streaming into HBase - posted by kpeng1 <kp...@gmail.com> on 2014/09/03 23:05:06 UTC, 7 replies.
- How do you debug with the logs ? - posted by Yan Fang <ya...@gmail.com> on 2014/09/03 23:05:59 UTC, 0 replies.
- If master is local, where are master and workers? - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/04 00:12:41 UTC, 3 replies.
- Re: Running Wordcount on large file stucks and throws OOM exception - posted by Zhan Zhang <zz...@hortonworks.com> on 2014/09/04 01:43:36 UTC, 0 replies.
- [MLib] How do you normalize features? - posted by Yana Kadiyska <ya...@gmail.com> on 2014/09/04 02:10:16 UTC, 1 replies.
- Multi-tenancy for Spark (Streaming) Applications - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/09/04 03:30:30 UTC, 3 replies.
- Why spark on yarn applicationmaster cannot get a proper resourcemanager address from yarnconfiguration? - posted by 남윤민 <ro...@dgist.ac.kr> on 2014/09/04 04:15:56 UTC, 1 replies.
- Re: Is there any way to control the parallelism in LogisticRegression - posted by Jiusheng Chen <ch...@gmail.com> on 2014/09/04 04:18:06 UTC, 6 replies.
- resize memory size for caching RDD - posted by 牛兆捷 <nz...@gmail.com> on 2014/09/04 05:29:40 UTC, 1 replies.
- How to use memcached with spark - posted by gavin zhang <ga...@gmail.com> on 2014/09/04 05:50:01 UTC, 0 replies.
- Re: memory size for caching RDD - posted by Patrick Wendell <pw...@gmail.com> on 2014/09/04 07:45:24 UTC, 6 replies.
- Starting Thriftserver via hostname on Spark 1.1 RC4? - posted by Denny Lee <de...@gmail.com> on 2014/09/04 07:47:00 UTC, 3 replies.
- Programatically running of the Spark Jobs. - posted by Vicky Kak <vi...@gmail.com> on 2014/09/04 08:39:40 UTC, 7 replies.
- error: type mismatch while assigning RDD to RDD val object - posted by Dhimant <dh...@gmail.com> on 2014/09/04 08:56:21 UTC, 1 replies.
- Iterate over ArrayBuffer - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/04 10:04:32 UTC, 2 replies.
- Multiple spark shell sessions - posted by Dhimant <dh...@gmail.com> on 2014/09/04 11:58:28 UTC, 3 replies.
- Spark streaming saveAsHadoopFiles API question - posted by Hemanth Yamijala <yh...@gmail.com> on 2014/09/04 12:10:15 UTC, 0 replies.
- Spark processes not doing on killing corresponding YARN application - posted by Hemanth Yamijala <yh...@gmail.com> on 2014/09/04 12:26:21 UTC, 2 replies.
- spark sql results maintain order (in python) - posted by jamborta <ja...@gmail.com> on 2014/09/04 12:42:25 UTC, 1 replies.
- Is "cluster manager" same as "master"? - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/04 14:52:14 UTC, 1 replies.
- Using Spark to add data to an existing Parquet file without a schema - posted by Jim Carroll <ji...@gmail.com> on 2014/09/04 15:18:34 UTC, 1 replies.
- Object serialisation inside closures - posted by Andrianasolo Fanilo <fa...@worldline.com> on 2014/09/04 15:29:00 UTC, 4 replies.
- [Spark Streaming] Tracking/solving 'block input not found' - posted by Gerard Maas <ge...@gmail.com> on 2014/09/04 15:33:31 UTC, 1 replies.
- subscribe - posted by Erik van oosten <e....@grons.nl> on 2014/09/04 16:22:32 UTC, 0 replies.
- re: advice on spark input development - python or scala? - posted by Johnny Kelsey <jk...@semblent.com> on 2014/09/04 16:49:40 UTC, 1 replies.
- advice sought on spark/cassandra input development - scala or python? - posted by Johnny Kelsey <jk...@semblent.com> on 2014/09/04 17:03:13 UTC, 2 replies.
- Any issues with repartition? - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/04 17:50:44 UTC, 4 replies.
- EC2 - JNI crashes JVM with multi core instances - posted by Iriasthor <mi...@gmail.com> on 2014/09/04 17:58:12 UTC, 2 replies.
- Setting Java properties for Standalone on Windows 7? - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/04 18:09:26 UTC, 0 replies.
- 2 python installations cause PySpark on Yarn problem - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/04 18:25:51 UTC, 2 replies.
- pandas-like dataframe in spark - posted by Mohit Jaggi <mo...@gmail.com> on 2014/09/04 18:27:46 UTC, 2 replies.
- efficient zipping of lots of RDDs - posted by Mohit Jaggi <mo...@gmail.com> on 2014/09/04 18:36:18 UTC, 1 replies.
- spark streaming - saving DStream into HBASE doesn't work - posted by salemi <al...@udo.edu> on 2014/09/04 18:56:07 UTC, 0 replies.
- SchemaRDD - Parquet - "insertInto" makes many files - posted by DanteSama <ch...@sojo.com> on 2014/09/04 19:40:49 UTC, 5 replies.
- Re: Viewing web UI after fact - posted by Andrew Or <an...@databricks.com> on 2014/09/04 20:20:29 UTC, 3 replies.
- Re: API to add/remove containers inside an application - posted by Praveen Seluka <ps...@qubole.com> on 2014/09/04 21:26:53 UTC, 1 replies.
- Reduce truncates RDD in standalone, but fine when local. - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/04 22:15:01 UTC, 1 replies.
- Getting the type of an RDD in spark AND pyspark - posted by esamanas <ev...@gmail.com> on 2014/09/04 22:32:07 UTC, 2 replies.
- TimeStamp selection with SparkSQL - posted by Benjamin Zaitlen <qu...@gmail.com> on 2014/09/04 23:36:48 UTC, 3 replies.
- spark RDD join Error - posted by Veeranagouda Mukkanagoudar <ve...@gmail.com> on 2014/09/05 01:36:54 UTC, 3 replies.
- How spark parallelize maps Slices to tasks/executors/workers - posted by "Mozumder, Monir" <Mo...@amd.com> on 2014/09/05 03:55:04 UTC, 1 replies.
- Serialize input path - posted by jerryye <je...@gmail.com> on 2014/09/05 04:45:38 UTC, 2 replies.
- NotSerializableException: org.apache.spark.sql.hive.api.java.JavaHiveContext - posted by Bijoy Deb <bi...@gmail.com> on 2014/09/05 08:19:05 UTC, 0 replies.
- PySpark on Yarn a lot of python scripts project - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/05 10:28:09 UTC, 9 replies.
- Recursion - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/05 11:16:51 UTC, 1 replies.
- question on replicate() in blockManager.scala - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/05 11:19:51 UTC, 1 replies.
- Spark that integrates with Kafka 0.7 - posted by Hemanth Yamijala <yh...@gmail.com> on 2014/09/05 12:07:05 UTC, 2 replies.
- Running spark-shell (or queries) over the network (not from master) - posted by Ognen Duzlevski <og...@gmail.com> on 2014/09/05 12:32:57 UTC, 11 replies.
- New sbt plugin to deploy jobs to EC2 - posted by Felix Garcia Borrego <fb...@gilt.com> on 2014/09/05 12:37:59 UTC, 2 replies.
- error: type mismatch while Union - posted by Dhimant <dh...@gmail.com> on 2014/09/05 13:58:36 UTC, 4 replies.
- spark 1.1.0 requested array size exceed vm limits - posted by marylucy <qa...@hotmail.com> on 2014/09/05 15:40:51 UTC, 2 replies.
- Task not serializable - posted by Sarath Chandra <sa...@algofusiontech.com> on 2014/09/05 16:06:52 UTC, 10 replies.
- Spark-cassandra-connector 1.0.0-rc5: java.io.NotSerializableException - posted by Shing Hing Man <ma...@yahoo.com.INVALID> on 2014/09/05 16:26:51 UTC, 0 replies.
- replicated rdd storage problem - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/05 17:42:24 UTC, 0 replies.
- spark-streaming-kafka with broadcast variable - posted by Penny Espinoza <pe...@societyconsulting.com> on 2014/09/05 19:36:10 UTC, 1 replies.
- Re: how to choose right DStream batch interval - posted by qihong <qc...@pivotal.io> on 2014/09/05 21:09:32 UTC, 3 replies.
- Re: Shared variable in Spark Streaming - posted by Chris Fregly <ch...@fregly.com> on 2014/09/05 21:42:07 UTC, 1 replies.
- Repartition inefficient - posted by "anthonyjschulte@gmail.com" <an...@gmail.com> on 2014/09/05 22:09:44 UTC, 0 replies.
- Re: Sparse Matrices support in Spark - posted by "anthonyjschulte@gmail.com" <an...@gmail.com> on 2014/09/05 22:24:59 UTC, 1 replies.
- prepending jars to the driver class path for spark-submit on YARN - posted by Penny Espinoza <pe...@societyconsulting.com> on 2014/09/06 01:33:32 UTC, 7 replies.
- Re: Huge matrix - posted by Debasish Das <de...@gmail.com> on 2014/09/06 03:14:07 UTC, 20 replies.
- Array and RDDs - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/06 07:21:47 UTC, 1 replies.
- How to change the values in Array of Bytes - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/06 11:09:39 UTC, 1 replies.
- unsubscribe - posted by Murali Raju <mu...@infrastacks.com> on 2014/09/06 13:48:17 UTC, 2 replies.
- Spark SQL check if query is completed (pyspark) - posted by jamborta <ja...@gmail.com> on 2014/09/06 15:02:05 UTC, 4 replies.
- Q: About scenarios where driver execution flow may block... - posted by didata <su...@didata.us> on 2014/09/06 20:44:58 UTC, 1 replies.
- Spark Streaming and database access (e.g. MySQL) - posted by jchen <jc...@pivotal.io> on 2014/09/07 08:24:12 UTC, 6 replies.
- Crawler and Scraper with different priorities - posted by Sandeep Singh <sa...@techaddict.me> on 2014/09/07 09:15:30 UTC, 3 replies.
- Adding quota to the ephemeral hdfs on a standalone spark cluster on ec2 - posted by Tomer Benyamini <to...@gmail.com> on 2014/09/07 14:27:18 UTC, 2 replies.
- Fwd: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop. - posted by Ognen Duzlevski <og...@gmail.com> on 2014/09/07 15:31:39 UTC, 0 replies.
- distcp on ec2 standalone spark cluster - posted by Tomer Benyamini <to...@gmail.com> on 2014/09/07 15:42:26 UTC, 11 replies.
- Spark groupByKey partition out of memory - posted by julyfire <he...@gmail.com> on 2014/09/08 02:23:09 UTC, 0 replies.
- Deployment model popularity - Standard vs. YARN vs. Mesos vs. SIMR - posted by Otis Gospodnetic <ot...@gmail.com> on 2014/09/08 03:29:05 UTC, 1 replies.
- Solving Systems of Linear Equations Using Spark? - posted by durin <ma...@simon-schaefer.net> on 2014/09/08 06:59:18 UTC, 6 replies.
- sharing off_heap rdds - posted by Manku Timma <ma...@gmail.com> on 2014/09/08 10:59:58 UTC, 0 replies.
- How to profile a spark application - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/08 11:48:37 UTC, 4 replies.
- Re: Standalone spark cluster. Can't submit job programmatically -> java.io.InvalidClassException - posted by DrKhu <kh...@gmail.com> on 2014/09/08 15:36:45 UTC, 0 replies.
- spark application in cluster mode doesn't run correctly - posted by 남윤민 <ro...@dgist.ac.kr> on 2014/09/08 15:40:17 UTC, 0 replies.
- Error while running sparkSQL application in the cluster-mode environment - posted by 남윤민 <ro...@dgist.ac.kr> on 2014/09/08 15:46:45 UTC, 0 replies.
- How to scale large kafka topic - posted by richiesgr <ri...@gmail.com> on 2014/09/08 15:47:35 UTC, 0 replies.
- clarification for some spark on yarn configuration options - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/09/08 15:59:59 UTC, 9 replies.
- Cannot run SimpleApp as regular Java app - posted by ericacm <er...@gmail.com> on 2014/09/08 16:16:19 UTC, 4 replies.
- Spark SQL on Cassandra - posted by gtinside <gt...@gmail.com> on 2014/09/08 16:22:55 UTC, 1 replies.
- A problem for running MLLIB in amazon clound - posted by Hui Li <hl...@gmail.com> on 2014/09/08 16:23:47 UTC, 1 replies.
- groupBy gives non deterministic results - posted by redocpot <ju...@gmail.com> on 2014/09/08 16:29:23 UTC, 12 replies.
- How do you perform blocking IO in apache spark job? - posted by DrKhu <kh...@gmail.com> on 2014/09/08 17:30:50 UTC, 5 replies.
- If for YARN you use 'spark.yarn.jar', what is the LOCAL equivalent to that property ... - posted by "Dimension Data, LLC." <su...@didata.us> on 2014/09/08 18:35:21 UTC, 9 replies.
- Setting Kafka parameters in Spark Streaming - posted by Hemanth Yamijala <yh...@gmail.com> on 2014/09/08 18:48:37 UTC, 3 replies.
- Spark-submit ClassNotFoundException with JAR! - posted by Peter Aberline <pe...@gmail.com> on 2014/09/08 19:03:41 UTC, 0 replies.
- Input Field in Spark 1.1 Web UI - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/08 21:25:07 UTC, 0 replies.
- saveAsHadoopFile into avro format - posted by Dariusz Kobylarz <da...@gmail.com> on 2014/09/08 21:46:17 UTC, 0 replies.
- Records - Input Byte - posted by danilopds <da...@gmail.com> on 2014/09/08 22:54:18 UTC, 1 replies.
- Recommendations for performance - posted by Manu Mukerji <ma...@gmail.com> on 2014/09/08 23:37:59 UTC, 0 replies.
- Querying a parquet file in s3 with an ec2 install - posted by Jim Carroll <ji...@gmail.com> on 2014/09/08 23:58:30 UTC, 5 replies.
- [Spark Streaming] java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Yan Fang <ya...@gmail.com> on 2014/09/09 01:37:36 UTC, 0 replies.
- Spark Web UI in Mesos mode - posted by SK <sk...@gmail.com> on 2014/09/09 01:45:49 UTC, 1 replies.
- Executor address issue: "CANNOT FIND ADDRESS" (Spark 0.9.1) - posted by Nicolas Mai <ni...@gmail.com> on 2014/09/09 02:51:23 UTC, 1 replies.
- Is the structure for a jar file for running Spark applications the same as that for Hadoop - posted by Steve Lewis <lo...@gmail.com> on 2014/09/09 03:56:33 UTC, 6 replies.
- Spark streaming for synchronous API - posted by Ron's Yahoo! <zl...@yahoo.com.INVALID> on 2014/09/09 04:27:38 UTC, 7 replies.
- Iterable of Strings - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/09 08:07:42 UTC, 1 replies.
- Spark streaming: size of DStream - posted by julyfire <he...@gmail.com> on 2014/09/09 08:41:44 UTC, 9 replies.
- Accuracy hit in classification with Spark - posted by jatinpreet <ja...@gmail.com> on 2014/09/09 09:15:06 UTC, 7 replies.
- Filter function problem - posted by Blackeye <bl...@iit.demokritos.gr> on 2014/09/09 12:09:43 UTC, 3 replies.
- spark functionality similar to hadoop's RecordWriter close method - posted by robertberta <ro...@atigeo.com> on 2014/09/09 14:39:13 UTC, 1 replies.
- Problem in running mosek in spark cluster - java.lang.UnsatisfiedLinkError: no mosekjava7_0 in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738) - posted by ayandas84 <ay...@gmail.com> on 2014/09/09 16:17:36 UTC, 0 replies.
- PySpark on Yarn - how group by data properly - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/09 18:56:03 UTC, 2 replies.
- RDD memory questions - posted by Boxian Dong <bo...@indoo.rs> on 2014/09/09 19:07:08 UTC, 4 replies.
- Re: streaming: code to simulate a network socket data source - posted by danilopds <da...@gmail.com> on 2014/09/09 20:14:27 UTC, 1 replies.
- Re: Problem in running mosek in spark cluster - java.lang.UnsatisfiedLinkError: no mosekjava7_0 in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1738) - posted by Yana Kadiyska <ya...@gmail.com> on 2014/09/09 21:10:10 UTC, 0 replies.
- spark on yarn history server + hdfs permissions issue - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/09/09 21:30:16 UTC, 1 replies.
- spark-streaming "Could not compute split" exception - posted by Penny Espinoza <pe...@societyconsulting.com> on 2014/09/09 22:13:53 UTC, 4 replies.
- Distributed Deep Learning Workshop with Scala, Akka, and Spark - posted by Alexy Khrabrov <al...@scalable.pro> on 2014/09/09 22:41:51 UTC, 0 replies.
- Spark HiveQL support plan - posted by "XUE, Xiaohui" <xi...@sap.com> on 2014/09/09 22:45:33 UTC, 1 replies.
- spark.cleaner.ttl and spark.streaming.unpersist - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/09/09 23:21:08 UTC, 6 replies.
- Yarn Driver OOME (Java heap space) when executors request map output locations - posted by jbeynon <jb...@gmail.com> on 2014/09/09 23:54:28 UTC, 3 replies.
- Spark caching questions - posted by Vladimir Rodionov <vr...@splicemachine.com> on 2014/09/10 01:13:58 UTC, 1 replies.
- Deregistered receiver for stream 0: Stopped by driver - posted by Sing Yip <si...@yahoo.com.INVALID> on 2014/09/10 01:20:54 UTC, 0 replies.
- Spark + AccumuloInputFormat - posted by Russ Weeks <rw...@newbrightidea.com> on 2014/09/10 02:13:27 UTC, 1 replies.
- Table not found: using jdbc console to query sparksql hive thriftserver - posted by alexandria1101 <al...@gmail.com> on 2014/09/10 02:16:42 UTC, 10 replies.
- how to run python examples in spark 1.1? - posted by freedafeng <fr...@yahoo.com> on 2014/09/10 02:27:32 UTC, 1 replies.
- serialization changes -- OOM - posted by Manku Timma <ma...@gmail.com> on 2014/09/10 02:39:06 UTC, 0 replies.
- EOFException when reading from HDFS - posted by kent <ke...@gmail.com> on 2014/09/10 02:43:35 UTC, 2 replies.
- Re: Spark EC2 standalone - Utils.fetchFile no such file or directory - posted by luanjunyi <lu...@gmail.com> on 2014/09/10 04:41:49 UTC, 0 replies.
- how to setup steady state stream partitions - posted by qihong <qc...@pivotal.io> on 2014/09/10 05:03:08 UTC, 4 replies.
- How to set java.library.path in a spark cluster - posted by ayandas84 <ay...@gmail.com> on 2014/09/10 06:14:17 UTC, 1 replies.
- Spark SQL -- more than two tables for join - posted by "boyingking@163.com" <bo...@163.com> on 2014/09/10 12:06:46 UTC, 6 replies.
- Dependency Problem with Spark / ScalaTest / SBT - posted by Thorsten Bergler <sp...@tbonline.de> on 2014/09/10 12:46:01 UTC, 5 replies.
- nested rdd operation - posted by Pavlos Katsogridakis <ka...@ics.forth.gr> on 2014/09/10 14:57:19 UTC, 1 replies.
- JavaPairRDD to JavaPairRDD based on key - posted by Tom <th...@gmail.com> on 2014/09/10 15:01:23 UTC, 1 replies.
- Global Variables in Spark Streaming - posted by Ravi Sharma <ra...@gmail.com> on 2014/09/10 15:55:12 UTC, 4 replies.
- How to scale more consumer to Kafka stream - posted by richiesgr <ri...@gmail.com> on 2014/09/10 16:16:50 UTC, 5 replies.
- Some techniques for improving application performance - posted by Will Benton <wi...@redhat.com> on 2014/09/10 16:18:24 UTC, 0 replies.
- Spark & NLP - posted by Paolo Platter <pa...@agilelab.it> on 2014/09/10 16:36:44 UTC, 2 replies.
- Cassandra connector - posted by wwilkins <ww...@expedia.com> on 2014/09/10 17:42:10 UTC, 2 replies.
- Se8sIang i Atau zorOi - posted by Edgar Vega <al...@gmail.com> on 2014/09/10 18:59:48 UTC, 0 replies.
- Re: DELIVERY FAILURE: Error transferring to QCMBSJ601.HERMES.SI.SOCGEN; Maximum hop count exceeded. Message probably in a routing loop. - posted by Marcelo Vanzin <va...@cloudera.com> on 2014/09/10 20:25:40 UTC, 1 replies.
- Hadoop Distributed Cache - posted by Maximo Gurmendez <mg...@dataxu.com> on 2014/09/10 21:11:46 UTC, 0 replies.
- [spark upgrade] Error communicating with MapOutputTracker when running test cases in latest spark - posted by Adrian Mocanu <am...@verticalscope.com> on 2014/09/10 21:23:57 UTC, 0 replies.
- Accumulo and Spark - posted by Megavolt <jb...@42six.com> on 2014/09/10 22:17:31 UTC, 1 replies.
- java.lang.ClassCastException: java.lang.Long cannot be cast to scala.Tuple2 - posted by Jeffrey Picard <jp...@columbia.edu> on 2014/09/10 23:00:00 UTC, 3 replies.
- PrintWriter error in foreach - posted by Arun Luthra <ar...@gmail.com> on 2014/09/11 00:46:57 UTC, 2 replies.
- GraphX : AssertionError - posted by Vipul Pandey <vi...@gmail.com> on 2014/09/11 01:31:28 UTC, 0 replies.
- Setting up jvm in pyspark from shell - posted by Mohit Singh <mo...@gmail.com> on 2014/09/11 07:43:25 UTC, 1 replies.
- Spark SQL Thrift JDBC server deployment for production - posted by vasiliy <za...@gmail.com> on 2014/09/11 08:55:36 UTC, 3 replies.
- Re: can fileStream() or textFileStream() remember state? - posted by vasiliy <za...@gmail.com> on 2014/09/11 10:42:06 UTC, 0 replies.
- SchemaRDD saveToCassandra - posted by lmk <la...@gmail.com> on 2014/09/11 11:37:59 UTC, 2 replies.
- Spark not installed + no access to web UI - posted by mrm <ma...@skimlinks.com> on 2014/09/11 11:40:19 UTC, 2 replies.
- Unpersist - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/11 11:56:42 UTC, 2 replies.
- JMXSink for YARN deployment - posted by Vladimir Tretyakov <vl...@sematext.com> on 2014/09/11 13:30:14 UTC, 9 replies.
- Spark streaming stops computing while the receiver keeps running without any errors reported - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/11 13:51:31 UTC, 4 replies.
- problem in using Spark-Cassandra connector - posted by Karunya Padala <Ka...@infotech-enterprises.com> on 2014/09/11 14:34:28 UTC, 2 replies.
- Spark on Raspberry Pi? - posted by Sandeep Singh <sa...@techaddict.me> on 2014/09/11 15:04:04 UTC, 5 replies.
- unable to create new native thread - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/11 15:41:31 UTC, 0 replies.
- Some Serious Issue with Spark Streaming ? Blocks Getting Removed and Jobs have Failed.. - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2014/09/11 16:13:30 UTC, 6 replies.
- Re[2]: HBase 0.96+ with Spark 1.0+ - posted by sp...@orbit-x.de on 2014/09/11 16:34:33 UTC, 4 replies.
- Python execution support on clusters - posted by david_allanus <da...@yahoo.com> on 2014/09/11 16:54:55 UTC, 0 replies.
- compiling spark source code - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/11 17:27:48 UTC, 9 replies.
- Out of memory with Spark Streaming - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/11 17:31:56 UTC, 5 replies.
- Spark SQL and running parquet tables? - posted by DanteSama <ch...@sojo.com> on 2014/09/11 19:20:53 UTC, 4 replies.
- Re: Spark SQL JDBC - posted by alexandria1101 <al...@gmail.com> on 2014/09/11 19:50:33 UTC, 2 replies.
- Network requirements between Driver, Master, and Slave - posted by Jim Carroll <ji...@gmail.com> on 2014/09/11 20:22:20 UTC, 3 replies.
- SparkSQL HiveContext TypeTag compile error - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/11 20:25:59 UTC, 2 replies.
- Reading from multiple sockets - posted by Varad Joshi <vj...@pivotal.io> on 2014/09/11 20:52:43 UTC, 0 replies.
- RE: cannot read file form a local path - posted by "Mozumder, Monir" <Mo...@amd.com> on 2014/09/11 21:14:40 UTC, 1 replies.
- single worker vs multiple workers on each machine - posted by Mike Sam <mi...@gmail.com> on 2014/09/11 21:42:05 UTC, 2 replies.
- spark sql - create new_table as select * from table - posted by jamborta <ja...@gmail.com> on 2014/09/11 22:21:43 UTC, 5 replies.
- SparkContext and multi threads - posted by moon soo Lee <le...@gmail.com> on 2014/09/11 23:23:05 UTC, 0 replies.
- Fwd: 40 Minute Spark Time Out - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/12 01:07:43 UTC, 3 replies.
- Spark Streaming in 1 hour batch duration RDD files gets lost - posted by Jeoffrey Lim <je...@gmail.com> on 2014/09/12 01:47:24 UTC, 0 replies.
- Backwards RDD - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/12 02:00:27 UTC, 1 replies.
- Announcing Spark 1.1.0! - posted by Patrick Wendell <pw...@gmail.com> on 2014/09/12 02:12:38 UTC, 12 replies.
- Configuring Spark for heterogenous hardware - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/12 02:44:49 UTC, 4 replies.
- History server: ERROR ReplayListenerBus: Exception in parsing Spark event log - posted by SK <sk...@gmail.com> on 2014/09/12 02:49:11 UTC, 0 replies.
- coalesce on SchemaRDD in pyspark - posted by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/12 03:12:14 UTC, 3 replies.
- Applications status missing when Spark HA(zookeeper) enabled - posted by jason chen <py...@gmail.com> on 2014/09/12 04:34:56 UTC, 1 replies.
- Re: DistCP - Spark-based - posted by Nicholas Chammas <ni...@gmail.com> on 2014/09/12 05:20:14 UTC, 0 replies.
- SparkSQL hang due to - posted by linkpatrickliu <li...@live.com> on 2014/09/12 08:04:01 UTC, 0 replies.
- spark-1.1.0 with make-distribution.sh problem - posted by Zhanfeng Huo <hu...@gmail.com> on 2014/09/12 08:13:15 UTC, 3 replies.
- replicate() method in BlockManager.scala choosing only one node for replication. - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/12 08:37:32 UTC, 1 replies.
- Yarn Over-allocating Containers - posted by praveen seluka <pr...@gmail.com> on 2014/09/12 08:44:33 UTC, 1 replies.
- Re: Computing mean and standard deviation by key - posted by rzykov <rz...@gmail.com> on 2014/09/12 08:46:51 UTC, 5 replies.
- Perserving conf files when restarting ec2 cluster - posted by jerryye <je...@gmail.com> on 2014/09/12 09:49:28 UTC, 0 replies.
- Re: Kyro deserialisation error - posted by ayandas84 <ay...@gmail.com> on 2014/09/12 10:11:57 UTC, 0 replies.
- Unable to ship external Python libraries in PYSPARK - posted by yh18190 <yh...@gmail.com> on 2014/09/12 10:39:53 UTC, 3 replies.
- Serving data - posted by Marius Soutier <mp...@gmail.com> on 2014/09/12 11:23:48 UTC, 10 replies.
- Using filter in joined dataset - posted by vishnu86 <vi...@yahoo.com> on 2014/09/12 11:29:20 UTC, 0 replies.
- Error "Driver disassociated" while running the spark job - posted by 남윤민 <ro...@dgist.ac.kr> on 2014/09/12 15:01:43 UTC, 1 replies.
- Re: What is a pre built package of Apache Spark - posted by "andrew.craft" <an...@shiftenergy.com> on 2014/09/12 15:50:57 UTC, 3 replies.
- Fwd: Define the name of the outputs with Java-Spark. - posted by Guillermo Ortiz <ko...@gmail.com> on 2014/09/12 16:45:29 UTC, 1 replies.
- How to initiate a shutdown of Spark Streaming context? - posted by stanley <wa...@yahoo.com> on 2014/09/12 16:59:11 UTC, 3 replies.
- Why I get java.lang.OutOfMemoryError: Java heap space with join ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/09/12 17:37:34 UTC, 0 replies.
- Re: Use Case of mutable RDD - any ideas around will help. - posted by Patrick Wendell <pw...@gmail.com> on 2014/09/12 18:07:06 UTC, 9 replies.
- Nested Case Classes (Found and Required Same) - posted by iramaraju <ir...@gmail.com> on 2014/09/12 18:12:19 UTC, 3 replies.
- slides from df talk at global big data conference - posted by Mohit Jaggi <mo...@gmail.com> on 2014/09/12 18:18:30 UTC, 0 replies.
- Spark and Scala - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/12 18:33:14 UTC, 10 replies.
- Stable spark streaming app - posted by Tim Smith <se...@gmail.com> on 2014/09/12 19:09:53 UTC, 8 replies.
- Re: split a RDD by pencetage - posted by "pankaj.arora" <pa...@gmail.com> on 2014/09/12 19:27:50 UTC, 0 replies.
- How to initialize StateDStream - posted by Soumitra Kumar <ku...@gmail.com> on 2014/09/12 20:36:37 UTC, 4 replies.
- Re: When does Spark switch from PROCESS_LOCAL to NODE_LOCAL or RACK_LOCAL? - posted by Nicholas Chammas <ni...@gmail.com> on 2014/09/12 21:12:56 UTC, 3 replies.
- Spark 1.1.0: Cannot load main class from JAR - posted by SK <sk...@gmail.com> on 2014/09/12 22:31:23 UTC, 2 replies.
- Re: SparkSQL hang due to - posted by Michael Armbrust <mi...@databricks.com> on 2014/09/12 22:50:11 UTC, 0 replies.
- Where do logs go in StandAlone mode - posted by Tim Smith <se...@gmail.com> on 2014/09/12 23:19:46 UTC, 0 replies.
- spark 1.1 failure. class conflict? - posted by freedafeng <fr...@yahoo.com> on 2014/09/12 23:20:20 UTC, 2 replies.
- Executor garbage collection - posted by Tim Smith <se...@gmail.com> on 2014/09/12 23:56:54 UTC, 0 replies.
- NullWritable not serializable - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/13 02:47:44 UTC, 4 replies.
- sc.textFile problem due to newlines within a CSV record - posted by Mohit Jaggi <mo...@gmail.com> on 2014/09/13 04:43:58 UTC, 2 replies.
- workload for spark - posted by 牛兆捷 <nz...@gmail.com> on 2014/09/13 05:00:07 UTC, 0 replies.
- Looking for a good sample of Using Spark to do things Hadoop can do - posted by Steve Lewis <lo...@gmail.com> on 2014/09/13 07:32:05 UTC, 0 replies.
- SPARK_MASTER_IP - posted by Koert Kuipers <ko...@tresata.com> on 2014/09/13 08:03:05 UTC, 2 replies.
- Re: How to save mllib model to hdfs and reload it - posted by Yanbo Liang <ya...@gmail.com> on 2014/09/13 09:00:20 UTC, 0 replies.
- [mllib] LogisticRegressionWithLBFGS interface is not consistent with LogisticRegressionWithSGD - posted by Yanbo Liang <ya...@gmail.com> on 2014/09/13 11:12:21 UTC, 2 replies.
- RDDs and Immutability - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/13 11:39:27 UTC, 1 replies.
- ReduceByKey performance optimisation - posted by Julien Carme <ju...@gmail.com> on 2014/09/13 11:46:33 UTC, 5 replies.
- Write 1 RDD to multiple output paths in one go - posted by Nick Chammas <ni...@gmail.com> on 2014/09/13 19:25:04 UTC, 4 replies.
- spark 1.1.0 unit tests fail - posted by Koert Kuipers <ko...@tresata.com> on 2014/09/14 02:27:17 UTC, 2 replies.
- Workload for spark testing - posted by 牛兆捷 <nz...@gmail.com> on 2014/09/14 03:23:02 UTC, 0 replies.
- Spark SQL - posted by rkishore999 <rk...@yahoo.com> on 2014/09/14 07:29:26 UTC, 1 replies.
- Broadcast error - posted by Chengi Liu <ch...@gmail.com> on 2014/09/14 11:20:07 UTC, 14 replies.
- File operations on spark - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/14 12:21:15 UTC, 0 replies.
- Driver fail with out of memory exception - posted by richiesgr <ri...@gmail.com> on 2014/09/14 13:41:02 UTC, 1 replies.
- object hbase is not a member of package org.apache.hadoop - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/14 16:36:37 UTC, 8 replies.
- failed to run SimpleApp locally on macbook - posted by Gary Zhao <ga...@gmail.com> on 2014/09/14 19:15:29 UTC, 0 replies.
- Re: HBase 0.96+ with Spark 1.0+ - posted by Reinis Vicups <sp...@orbit-x.de> on 2014/09/14 19:21:26 UTC, 2 replies.
- Alternative to spark.executor.extraClassPath ? - posted by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/09/15 04:31:07 UTC, 0 replies.
- PathFilter for newAPIHadoopFile? - posted by Eric Friedman <er...@gmail.com> on 2014/09/15 04:37:53 UTC, 6 replies.
- About SparkSQL 1.1.0 join between more than two table - posted by "boyingking@163.com" <bo...@163.com> on 2014/09/15 04:41:40 UTC, 2 replies.
- combineByKey throws ClassCastException - posted by Tao Xiao <xi...@gmail.com> on 2014/09/15 08:06:44 UTC, 2 replies.
- SparkSQL 1.1 hang when "DROP" or "LOAD" - posted by linkpatrickliu <li...@live.com> on 2014/09/15 08:34:38 UTC, 13 replies.
- Re: Developing a spark streaming application - posted by Santiago Mola <sm...@stratio.com> on 2014/09/15 09:32:50 UTC, 0 replies.
- Upgrading a standalone cluster on ec2 from 1.0.2 to 1.1.0 - posted by Tomer Benyamini <to...@gmail.com> on 2014/09/15 15:37:38 UTC, 0 replies.
- vertex active/inactive feature in Pregel API ? - posted by Yifan LI <ia...@gmail.com> on 2014/09/15 16:25:04 UTC, 5 replies.
- Found both spark.driver.extraClassPath and SPARK_CLASSPATH - posted by Koert Kuipers <ko...@tresata.com> on 2014/09/15 17:16:52 UTC, 1 replies.
- Compiler issues for multiple map on RDD - posted by Boromir Widas <vc...@gmail.com> on 2014/09/15 17:37:51 UTC, 2 replies.
- File I/O in spark - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/15 17:51:58 UTC, 8 replies.
- scala 2.11? - posted by Mohit Jaggi <mo...@gmail.com> on 2014/09/15 18:28:23 UTC, 7 replies.
- Need help with ThriftServer/Spark1.1.0 - posted by Yana Kadiyska <ya...@gmail.com> on 2014/09/15 19:25:33 UTC, 0 replies.
- MLLib sparse vector - posted by Sameer Tilak <ss...@live.com> on 2014/09/15 20:28:39 UTC, 3 replies.
- Example of Geoprocessing with Spark - posted by Abel Coronado Iruegas <ac...@gmail.com> on 2014/09/15 20:30:31 UTC, 5 replies.
- Dealing with Time Series Data - posted by Gary Malouf <ma...@gmail.com> on 2014/09/15 21:06:14 UTC, 1 replies.
- Efficient way to sum multiple columns - posted by jamborta <ja...@gmail.com> on 2014/09/15 22:00:57 UTC, 1 replies.
- Re: Spark Streaming union expected behaviour? - posted by Varad Joshi <vj...@pivotal.io> on 2014/09/15 22:20:15 UTC, 0 replies.
- minPartitions for non-text files? - posted by Eric Friedman <er...@gmail.com> on 2014/09/15 22:35:53 UTC, 4 replies.
- Weird aggregation results when reusing objects inside reduceByKey - posted by kriskalish <kr...@kalish.net> on 2014/09/15 22:58:27 UTC, 2 replies.
- Does Spark always wait for stragglers to finish running? - posted by Pramod Biligiri <pr...@gmail.com> on 2014/09/16 00:30:02 UTC, 3 replies.
- Invalid signature file digest for Manifest main attributes with spark job built using maven - posted by kpeng1 <kp...@gmail.com> on 2014/09/16 00:33:06 UTC, 2 replies.
- Convert GraphX Graph to Sparse Matrix - posted by crockpotveggies <ju...@outlook.com> on 2014/09/16 01:00:41 UTC, 0 replies.
- "apply at Option.scala:120" callback in Spark 1.1, but no user code involved? - posted by John Salvatier <js...@gmail.com> on 2014/09/16 01:39:04 UTC, 0 replies.
- Spark 1.1 / cdh4 stuck using old hadoop client? - posted by Paul Wais <pw...@yelp.com> on 2014/09/16 03:28:41 UTC, 5 replies.
- About SpakSQL OR MLlib - posted by "boyingking@163.com" <bo...@163.com> on 2014/09/16 05:38:57 UTC, 2 replies.
- How to set executor num on spark on yarn - posted by hequn cheng <ch...@gmail.com> on 2014/09/16 07:08:42 UTC, 1 replies.
- Complexity/Efficiency of SortByKey - posted by cjwang <cj...@cjwang.us> on 2014/09/16 07:31:47 UTC, 1 replies.
- Re: spark and mesos issue - posted by Gurvinder Singh <gu...@uninett.no> on 2014/09/16 08:17:10 UTC, 0 replies.
- SparkContext creation slow down unit tests - posted by 诺铁 <no...@gmail.com> on 2014/09/16 09:06:20 UTC, 2 replies.
- Spark Streaming: CoarseGrainedExecutorBackend: Slave registration failed: Duplicate executor ID - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2014/09/16 14:07:23 UTC, 3 replies.
- Reduce Tuple2 to Tuple2>> - posted by Tom <th...@gmail.com> on 2014/09/16 15:42:24 UTC, 1 replies.
- java.util.NoSuchElementException: key not found - posted by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/16 16:59:34 UTC, 1 replies.
- org.apache.spark.SparkException: java.io.FileNotFoundException: does not exist) - posted by Hui Li <li...@gmail.com> on 2014/09/16 17:08:21 UTC, 1 replies.
- Spark as a Library - posted by "Ruebenacker, Oliver A" <Ol...@altisource.com> on 2014/09/16 17:16:38 UTC, 5 replies.
- collect on hadoopFile RDD returns wrong results - posted by vasiliy <za...@gmail.com> on 2014/09/16 17:17:19 UTC, 5 replies.
- HBase and non-existent TableInputFormat - posted by "Y. Dong" <tq...@gmail.com> on 2014/09/16 17:18:20 UTC, 5 replies.
- RDD projection and sorting - posted by Sameer Tilak <ss...@live.com> on 2014/09/16 20:48:36 UTC, 0 replies.
- Spark processing small files. - posted by cem <ca...@gmail.com> on 2014/09/16 22:06:24 UTC, 0 replies.
- Indexed RDD - posted by Akshat Aranya <aa...@gmail.com> on 2014/09/16 22:11:13 UTC, 0 replies.
- Re: Categorical Features for K-Means Clustering - posted by st553 <st...@gmail.com> on 2014/09/16 23:04:54 UTC, 2 replies.
- Memory under-utilization - posted by francisco <ft...@nextag.com> on 2014/09/16 23:40:57 UTC, 4 replies.
- Questions about Spark speculation - posted by Nicolas Mai <ni...@gmail.com> on 2014/09/17 00:01:45 UTC, 1 replies.
- How do I manipulate values outside of a GraphX loop? - posted by crockpotveggies <ju...@outlook.com> on 2014/09/17 00:09:00 UTC, 0 replies.
- Problem with pyspark command line invocation -- option truncation... (Spark v1.1.0) ... - posted by "Dimension Data, LLC." <su...@didata.us> on 2014/09/17 00:22:33 UTC, 5 replies.
- MLlib - Possible to use SVM with Radial Basis Function kernel rather than Linear Kernel? - posted by Aris <ar...@gmail.com> on 2014/09/17 00:27:19 UTC, 3 replies.
- partitioned groupBy - posted by Akshat Aranya <aa...@gmail.com> on 2014/09/17 01:27:13 UTC, 3 replies.
- how to report documentation bug? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/17 02:55:13 UTC, 1 replies.
- CPU RAM - posted by VJ Shalish <vj...@gmail.com> on 2014/09/17 05:14:52 UTC, 5 replies.
- YARN mode not available error - posted by Barrington <ba...@me.com> on 2014/09/17 05:47:13 UTC, 1 replies.
- The difference between pyspark.rdd.PipelinedRDD and pyspark.rdd.RDD - posted by edmond_huo <hu...@gmail.com> on 2014/09/17 07:03:14 UTC, 4 replies.
- permission denied on local dir - posted by style95 <st...@gmail.com> on 2014/09/17 07:23:20 UTC, 3 replies.
- Do I Need to Set Checkpoint Interval for Every DStream? - posted by Ji ZHANG <zh...@gmail.com> on 2014/09/17 10:49:14 UTC, 0 replies.
- About the Spark Scala API Excption - posted by churly lin <ch...@gmail.com> on 2014/09/17 12:02:30 UTC, 0 replies.
- Change RDDs using map() - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/17 12:24:09 UTC, 2 replies.
- pyspark on yarn - lost executor - posted by Oleg Ruchovets <or...@gmail.com> on 2014/09/17 13:38:54 UTC, 5 replies.
- Short Circuit Local Reads - posted by Gary Malouf <ma...@gmail.com> on 2014/09/17 14:15:20 UTC, 3 replies.
- Number of partitions when saving (pyspark) - posted by Luis Guerra <lu...@gmail.com> on 2014/09/17 14:21:01 UTC, 2 replies.
- Adjacency List representation in Spark - posted by Harsha HN <99...@gmail.com> on 2014/09/17 15:43:22 UTC, 4 replies.
- list of documents sentiment analysis - problem with defining proper approach with Spark - posted by xn...@o2.pl, xn...@o2.pl on 2014/09/17 15:57:35 UTC, 0 replies.
- Spark and disk usage. - posted by Макар Красноперов <co...@gmail.com> on 2014/09/17 16:37:49 UTC, 6 replies.
- GroupBy Key and then sort values with the group - posted by ab...@thomsonreuters.com on 2014/09/17 17:37:57 UTC, 2 replies.
- How to ship cython library to workers? - posted by freedafeng <fr...@yahoo.com> on 2014/09/17 20:36:38 UTC, 0 replies.
- how to group within the messages at a vertex? - posted by spr <sp...@yarcdata.com> on 2014/09/17 20:39:19 UTC, 1 replies.
- Spark Streaming - batchDuration for streaming - posted by alJune <as...@gmail.com> on 2014/09/17 20:46:24 UTC, 1 replies.
- Re: OutOfMemoryError with basic kmeans - posted by st553 <st...@gmail.com> on 2014/09/17 21:16:30 UTC, 0 replies.
- How to run kmeans after pca? - posted by st553 <st...@gmail.com> on 2014/09/17 21:21:38 UTC, 3 replies.
- Re: set spark.local.dir on driver program doesn't take effect - posted by gphil <gp...@gphil.net> on 2014/09/17 22:23:22 UTC, 0 replies.
- storage.DiskBlockManager: Exception while deleting local spark dir - posted by Steve Lewis <lo...@gmail.com> on 2014/09/17 23:26:36 UTC, 0 replies.
- Size exceeds Integer.MAX_VALUE in BlockFetcherIterator - posted by francisco <ft...@nextag.com> on 2014/09/18 00:18:29 UTC, 3 replies.
- SPARK BENCHMARK TEST - posted by VJ Shalish <vj...@gmail.com> on 2014/09/18 00:59:48 UTC, 1 replies.
- spark-1.1.0-bin-hadoop2.4 java.lang.NoClassDefFoundError: org/codehaus/jackson/annotate/JsonClass - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/18 01:28:27 UTC, 2 replies.
- problem with HiveContext inside Actor - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/18 01:50:49 UTC, 7 replies.
- LZO support in Spark 1.0.0 - nothing seems to work - posted by rogthefrog <ro...@amino.com> on 2014/09/18 02:40:26 UTC, 3 replies.
- How to time Spark SQL statement? - posted by Arun Luthra <ar...@gmail.com> on 2014/09/18 02:52:47 UTC, 1 replies.
- Shark queries fail after 10% completion with UnknownHostException - posted by David Rosenstrauch <da...@darose.net> on 2014/09/18 02:59:31 UTC, 0 replies.
- Re: Cannot run unit test. - posted by Jies <cp...@gmail.com> on 2014/09/18 04:24:28 UTC, 0 replies.
- MLLib: LIBSVM issue - posted by Sameer Tilak <ss...@live.com> on 2014/09/18 04:25:10 UTC, 4 replies.
- Move Spark configuration from SPARK_CLASSPATH to spark-default.conf , HiveContext went wrong with "Class com.hadoop.compression.lzo.LzoCodec not found" - posted by Zhun Shen <sh...@gmail.com> on 2014/09/18 04:55:23 UTC, 0 replies.
- SchemaRDD and RegisterAsTable - posted by "Addanki, Santosh Kumar" <sa...@sap.com> on 2014/09/18 06:47:19 UTC, 3 replies.
- Python version of kmeans - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2014/09/18 07:26:40 UTC, 1 replies.
- SQL shell for Spark SQL? - posted by David Rosenstrauch <da...@darose.net> on 2014/09/18 07:50:14 UTC, 3 replies.
- Serious Issue with Spark Streaming ? Blocks Getting Removed and Jobs have Failed.. - posted by Rafeeq S <ra...@gmail.com> on 2014/09/18 08:42:55 UTC, 1 replies.
- how to track the jobs status without the webUI - posted by Tan Tim <un...@gmail.com> on 2014/09/18 09:03:30 UTC, 0 replies.
- some trouble with repartition - posted by Tan Tim <un...@gmail.com> on 2014/09/18 09:41:39 UTC, 0 replies.
- Unable to find proto buffer class error with RDD - posted by Paul Wais <pw...@yelp.com> on 2014/09/18 10:06:25 UTC, 6 replies.
- spark-submit: fire-and-forget mode? - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/09/18 10:19:32 UTC, 7 replies.
- Strange exception while accessing hdfs from spark. - posted by Julien Carme <ju...@gmail.com> on 2014/09/18 10:30:12 UTC, 0 replies.
- Spark run slow after unexpected repartition - posted by shishu <sh...@zamplus.com> on 2014/09/18 10:46:39 UTC, 2 replies.
- Better way to process large image data set ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/09/18 10:50:37 UTC, 2 replies.
- StackOverflowError - posted by gm yu <hu...@gmail.com> on 2014/09/18 12:07:41 UTC, 2 replies.
- Spot instances on Amazon EMR - posted by Grzegorz Białek <gr...@codilime.com> on 2014/09/18 12:19:55 UTC, 1 replies.
- [SparkStreaming] task failure with 'Unknown exception in doAs' - posted by Gerard Maas <ge...@gmail.com> on 2014/09/18 12:29:51 UTC, 1 replies.
- New API for TFIDF generation in Spark 1.1.0 - posted by jatinpreet <ja...@gmail.com> on 2014/09/18 12:46:31 UTC, 3 replies.
- Spark Package for Mesos - posted by John Omernik <jo...@omernik.com> on 2014/09/18 14:37:49 UTC, 0 replies.
- Kryo fails with avro having Arrays and unions, but succeeds with simple avro. - posted by "mohan.gadm" <mo...@gmail.com> on 2014/09/18 15:07:22 UTC, 8 replies.
- Odd error when using a rdd map within a stream map - posted by Filip Andrei <an...@gmail.com> on 2014/09/18 15:57:21 UTC, 2 replies.
- Joining multiple rowMatrix - posted by Debasish Das <de...@gmail.com> on 2014/09/18 16:09:37 UTC, 1 replies.
- Sending multiple DStream outputs - posted by Padmanabhan, "Mahesh (contractor)" <ma...@twc-contractor.com> on 2014/09/18 17:07:02 UTC, 2 replies.
- Spark Streaming and ReactiveMongo - posted by t1ny <wb...@gmail.com> on 2014/09/18 17:48:25 UTC, 2 replies.
- schema for schema - posted by Eric Friedman <er...@gmail.com> on 2014/09/18 17:49:28 UTC, 3 replies.
- Spark SQL Exception - posted by Paul Magid <Pa...@toyota.com> on 2014/09/18 18:31:17 UTC, 2 replies.
- Spark Zmq issue in cluster mode - posted by Hatch M <ha...@gmail.com> on 2014/09/18 18:36:14 UTC, 0 replies.
- MLLib regression model weights - posted by Sameer Tilak <ss...@live.com> on 2014/09/18 19:30:28 UTC, 2 replies.
- Anybody built the branch for Adaptive Boosting, extension to MLlib by Manish Amde? - posted by Aris <ar...@gmail.com> on 2014/09/18 20:26:19 UTC, 2 replies.
- Kafka Spark Streaming on Spark 1.1 - posted by JiajiaJing <jj...@gmail.com> on 2014/09/18 20:47:01 UTC, 2 replies.
- Spark on EC2 - posted by Gilberto Lira <gi...@scanboo.com.br> on 2014/09/18 20:48:03 UTC, 1 replies.
- Spark + Mahout - posted by Daniel Takabayashi <ta...@scanboo.com.br> on 2014/09/18 20:49:53 UTC, 5 replies.
- spark 1.1 examples build failure on cdh 5.1 - posted by freedafeng <fr...@yahoo.com> on 2014/09/18 21:38:53 UTC, 0 replies.
- SVD on larger than taller matrix - posted by Glitch <at...@datacratic.com> on 2014/09/18 22:02:51 UTC, 2 replies.
- PairRDD's lookup method Performance - posted by Harsha HN <99...@gmail.com> on 2014/09/19 00:06:50 UTC, 1 replies.
- Unable to load app logs for MLLib programs in history server - posted by SK <sk...@gmail.com> on 2014/09/19 00:50:24 UTC, 2 replies.
- AbstractMethodError when creating cassandraTable object - posted by Emil Gustafsson <em...@cellfish.se> on 2014/09/19 01:49:32 UTC, 0 replies.
- request to merge the pull request #1893 to master - posted by freedafeng <fr...@yahoo.com> on 2014/09/19 02:12:20 UTC, 0 replies.
- paging through an RDD that's too large to collect() all at once - posted by dave-anderson <da...@pobox.com> on 2014/09/19 04:58:06 UTC, 2 replies.
- diamond dependency tree - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/19 05:55:33 UTC, 3 replies.
- spark-submit command-line with --files - posted by chinchu <ch...@gmail.com> on 2014/09/19 07:53:36 UTC, 6 replies.
- Time difference between Python and Scala - posted by Luis Guerra <lu...@gmail.com> on 2014/09/19 09:07:56 UTC, 1 replies.
- Powered By Spark - posted by Alexander Albul <go...@gmail.com> on 2014/09/19 09:32:18 UTC, 0 replies.
- basic streaming question - posted by motte1988 <wi...@studserv.uni-leipzig.de> on 2014/09/19 09:35:22 UTC, 0 replies.
- rsync problem - posted by rapelly kartheek <ka...@gmail.com> on 2014/09/19 10:02:46 UTC, 10 replies.
- Bulk-load to HBase - posted by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/09/19 12:17:19 UTC, 13 replies.
- Anyone have successful recipe for spark cassandra connector? - posted by gzoller <gz...@gmail.com> on 2014/09/19 16:42:42 UTC, 0 replies.
- return probability \ confidence instead of actual class - posted by Adamantios Corais <ad...@gmail.com> on 2014/09/19 18:43:53 UTC, 4 replies.
- Spark 1.1.0 (w/ hadoop 2.4) versus aws-java-sdk-1.7.2.jar - posted by tian zhang <tz...@yahoo.com.INVALID> on 2014/09/19 19:16:38 UTC, 0 replies.
- RDD pipe example. Is this a bug or a feature? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/19 20:21:20 UTC, 4 replies.
- mllib performance on mesos cluster - posted by SK <sk...@gmail.com> on 2014/09/19 21:17:53 UTC, 2 replies.
- Problem with giving memory to executors on YARN - posted by Soumya Simanta <so...@gmail.com> on 2014/09/19 22:37:48 UTC, 3 replies.
- Reproducing the function of a Hadoop Reducer - posted by Steve Lewis <lo...@gmail.com> on 2014/09/19 22:37:56 UTC, 4 replies.
- Failed running Spark ALS - posted by "jw.cmu" <ji...@gmail.com> on 2014/09/19 23:35:01 UTC, 1 replies.
- Spark Streaming compilation error: algebird not a member of package com.twitter - posted by SK <sk...@gmail.com> on 2014/09/20 02:52:56 UTC, 1 replies.
- Spark Streaming: Calculate PV/UV by Minute and by Day? - posted by Ji ZHANG <zh...@gmail.com> on 2014/09/20 08:07:42 UTC, 0 replies.
- How to Exclude Spark Dependencies from spark-streaming-kafka? - posted by Ji ZHANG <zh...@gmail.com> on 2014/09/20 08:20:03 UTC, 0 replies.
- Fails to run simple Spark (Hello World) scala program - posted by Moshe Beeri <mo...@gmail.com> on 2014/09/20 09:55:58 UTC, 7 replies.
- spark job hung up - posted by Chen Song <ch...@gmail.com> on 2014/09/20 11:07:58 UTC, 0 replies.
- Re: Avoid broacasting huge variables - posted by "octavian.ganea" <oc...@inf.ethz.ch> on 2014/09/20 11:08:41 UTC, 4 replies.
- exception in spark 1.1.0 - posted by Chen Song <ch...@gmail.com> on 2014/09/20 11:12:58 UTC, 2 replies.
- secondary sort - posted by Koert Kuipers <ko...@tresata.com> on 2014/09/20 16:39:29 UTC, 2 replies.
- org.eclipse.jetty.orbit#javax.transaction;working@localhost: not found - posted by jinilover <co...@gmail.com> on 2014/09/20 17:45:32 UTC, 0 replies.
- SparkSQL Thriftserver in Mesos - posted by John Omernik <jo...@omernik.com> on 2014/09/20 19:16:09 UTC, 3 replies.
- Setting serializer to KryoSerializer from command line for spark-shell - posted by Soumya Simanta <so...@gmail.com> on 2014/09/20 20:40:07 UTC, 0 replies.
- Distributed dictionary building - posted by Debasish Das <de...@gmail.com> on 2014/09/20 21:10:25 UTC, 9 replies.
- java.lang.NegativeArraySizeException in pyspark - posted by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/20 21:42:28 UTC, 5 replies.
- understanding rdd pipe() and bin/spark-submit --master - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/20 21:51:26 UTC, 0 replies.
- Spark streaming twitter exception - posted by Maisnam Ns <ma...@gmail.com> on 2014/09/21 00:16:30 UTC, 3 replies.
- Could you please add us to 'Powered by Spark' List - posted by US Office Admin <ad...@Vectorum.com> on 2014/09/21 07:01:21 UTC, 0 replies.
- Saving RDD with array of strings - posted by Sarath Chandra <sa...@algofusiontech.com> on 2014/09/21 11:26:31 UTC, 1 replies.
- Setting up Spark 1.1 on Windows 7 - posted by Khaja M <kh...@gmail.com> on 2014/09/21 13:39:51 UTC, 2 replies.
- Issues with partitionBy: FetchFailed - posted by Julien Carme <ju...@gmail.com> on 2014/09/21 13:43:13 UTC, 8 replies.
- Shuffle size difference - operations on RDD vs. operations on SchemaRDD - posted by Grega Kešpret <gr...@celtra.com> on 2014/09/21 18:04:43 UTC, 1 replies.
- Can SparkContext shared across nodes/drivers - posted by 林武康 <vb...@gmail.com> on 2014/09/21 18:21:12 UTC, 0 replies.
- How to initialize updateStateByKey operation - posted by Soumitra Kumar <ku...@gmail.com> on 2014/09/21 19:43:01 UTC, 2 replies.
- java.lang.ClassNotFoundException on driver class in executor - posted by Barrington Henry <ba...@me.com> on 2014/09/21 21:52:53 UTC, 2 replies.
- Possibly a dumb question: differences between saveAsNewAPIHadoopFile and saveAsNewAPIHadoopDataset? - posted by innowireless TaeYun Kim <ta...@innowireless.co.kr> on 2014/09/22 08:24:44 UTC, 2 replies.
- Worker state is 'killed' - posted by Sarath Chandra <sa...@algofusiontech.com> on 2014/09/22 08:28:01 UTC, 0 replies.
- Spark - Apache Blur Connector : Index Kafka Messages into Blur using Spark Streaming - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2014/09/22 10:21:37 UTC, 0 replies.
- Spark SQL 1.1.0: NPE when join two cached table - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/09/22 10:34:49 UTC, 1 replies.
- Getting RDD load progress - posted by "cyril.ponomaryov" <cy...@gmail.com> on 2014/09/22 12:11:49 UTC, 0 replies.
- ParquetRecordReader warnings: counter initialization - posted by Andrew Ash <an...@andrewash.com> on 2014/09/22 12:27:20 UTC, 2 replies.
- Spark flume java.lang.ArrayIndexOutOfBoundsException: 1 - posted by centerqi hu <ce...@gmail.com> on 2014/09/22 14:32:05 UTC, 0 replies.
- Error while calculating the max temperature - posted by Praveen Sripati <pr...@gmail.com> on 2014/09/22 14:39:15 UTC, 3 replies.
- (Unknown) - posted by ji...@wipro.com on 2014/09/22 14:41:23 UTC, 2 replies.
- Unable to change the Ports - posted by ji...@wipro.com on 2014/09/22 14:44:08 UTC, 0 replies.
- Out of memory exception in MLlib's naive baye's classification training - posted by jatinpreet <ja...@gmail.com> on 2014/09/22 14:48:08 UTC, 7 replies.
- Re: GraphX : AssertionError - posted by Keith Massey <ke...@digitalreasoning.com> on 2014/09/22 17:10:37 UTC, 0 replies.
- Setup an huge Unserializable Object in a mapper - posted by matthes <md...@sensenetworks.com> on 2014/09/22 18:11:59 UTC, 3 replies.
- spark time out - posted by Chen Song <ch...@gmail.com> on 2014/09/22 18:22:10 UTC, 2 replies.
- Is there any way (in Java) to make a JavaRDD from an iterable - posted by Steve Lewis <lo...@gmail.com> on 2014/09/22 18:22:14 UTC, 3 replies.
- SparkSQL: Key not valid while running TPC-H - posted by Samay <sm...@gmail.com> on 2014/09/22 19:25:00 UTC, 0 replies.
- Accumulator Immutability? - posted by Vikram Kalabi <vi...@gmail.com> on 2014/09/22 20:30:01 UTC, 3 replies.
- Running Spark in Local Mode vs. Single Node Cluster - posted by kriskalish <kr...@kalish.net> on 2014/09/22 21:20:07 UTC, 0 replies.
- Spark SQL CLI - posted by gtinside <gt...@gmail.com> on 2014/09/22 22:02:29 UTC, 6 replies.
- The wikipedia Extraction (WEX) Dataset - posted by daidong <da...@gmail.com> on 2014/09/22 22:37:47 UTC, 1 replies.
- RDD data checkpoint cleaning - posted by RodrigoB <ro...@aspect.com> on 2014/09/22 23:10:14 UTC, 3 replies.
- Streaming: HdfsWordCount does not print any output - posted by SK <sk...@gmail.com> on 2014/09/22 23:44:34 UTC, 1 replies.
- Cancelled Key exception - posted by ab...@thomsonreuters.com on 2014/09/23 02:43:39 UTC, 0 replies.
- Java Implementation of StreamingContext.fileStream - posted by Michael Quinlan <mq...@gmail.com> on 2014/09/23 06:17:33 UTC, 2 replies.
- RE: Exception with SparkSql and Avro - posted by "Zalzberg, Idan (Agoda)" <Id...@agoda.com> on 2014/09/23 07:07:59 UTC, 2 replies.
- Change number of workers and memory - posted by Dhimant <dh...@gmail.com> on 2014/09/23 07:20:06 UTC, 1 replies.
- Why recommend 2-3 tasks per CPU core ? - posted by myasuka <my...@live.com> on 2014/09/23 07:58:29 UTC, 3 replies.
- Re: "sbt/sbt run" command returns a JVM problem - posted by christy <76...@qq.com> on 2014/09/23 08:05:48 UTC, 0 replies.
- Recommended ways to pass functions - posted by Kevin Jung <it...@samsung.com> on 2014/09/23 09:29:40 UTC, 1 replies.
- Where can I find the module diagram of SPARK? - posted by Theodore Si <sj...@gmail.com> on 2014/09/23 09:46:32 UTC, 2 replies.
- Access resources from jar-local resources folder - posted by Roberto Coluccio <ro...@gmail.com> on 2014/09/23 10:09:34 UTC, 1 replies.
- spark.local.dir and spark.worker.dir not used - posted by Priya Ch <le...@gmail.com> on 2014/09/23 12:31:15 UTC, 5 replies.
- Re: NullPointerException on reading checkpoint files - posted by RodrigoB <ro...@aspect.com> on 2014/09/23 12:52:47 UTC, 1 replies.
- Re: broadcast variable get cleaned by ContextCleaner unexpectedly ? - posted by RodrigoB <ro...@aspect.com> on 2014/09/23 13:37:08 UTC, 0 replies.
- Error launching spark application from Windows to Linux YARN Cluster - Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher - posted by dxrodri <dx...@gmail.com> on 2014/09/23 14:43:27 UTC, 0 replies.
- TorrentBroadcast causes java.io.IOException: unexpected exception type - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/23 15:38:36 UTC, 0 replies.
- Spark 1.1.0 on EC2 - posted by Gilberto Lira <gi...@scanboo.com.br> on 2014/09/23 16:11:01 UTC, 0 replies.
- recommended values for spark driver memory? - posted by Greg Hill <gr...@RACKSPACE.COM> on 2014/09/23 17:06:43 UTC, 0 replies.
- MLlib, what online(streaming) algorithms are available? - posted by "aka.fe2s" <ak...@gmail.com> on 2014/09/23 17:21:09 UTC, 1 replies.
- access javaobject in rdd map - posted by jamborta <ja...@gmail.com> on 2014/09/23 17:48:45 UTC, 4 replies.
- Multiple Kafka Receivers and Union - posted by Matt Narrell <ma...@gmail.com> on 2014/09/23 18:19:32 UTC, 11 replies.
- SparkSQL: Freezing while running TPC-H query 5 - posted by Samay <sm...@gmail.com> on 2014/09/23 18:33:14 UTC, 1 replies.
- Spark 1.1.0 hbase_inputformat.py not work - posted by Gilberto Lira <gi...@scanboo.com.br> on 2014/09/23 18:39:05 UTC, 1 replies.
- Transient association error on a 3 nodes cluster - posted by Edwin <al...@yahoo.com> on 2014/09/23 20:41:42 UTC, 0 replies.
- Re: spark1.0 principal component analysis - posted by st553 <st...@gmail.com> on 2014/09/23 21:49:04 UTC, 1 replies.
- Spark SQL 1.1.0 - large insert into parquet runs out of memory - posted by Dan Dietterich <da...@yahoo.com.INVALID> on 2014/09/23 22:36:15 UTC, 2 replies.
- HdfsWordCount only counts some of the words - posted by SK <sk...@gmail.com> on 2014/09/23 23:04:11 UTC, 4 replies.
- General question on persist - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/23 23:08:34 UTC, 3 replies.
- Spark Code to read RCFiles - posted by Pramod Biligiri <pr...@gmail.com> on 2014/09/23 23:25:43 UTC, 4 replies.
- Sorting a table in Spark - posted by "Areg Baghdasaryan (BLOOMBERG/ 731 LEX -)" <ab...@bloomberg.net> on 2014/09/23 23:34:13 UTC, 0 replies.
- Sorting a Table in Spark RDD - posted by "Areg Baghdasaryan (BLOOMBERG/ 731 LEX -)" <ab...@bloomberg.net> on 2014/09/23 23:40:25 UTC, 1 replies.
- Running Spark from an Intellij worksheet - akka.version error - posted by adrian <ad...@gmail.com> on 2014/09/23 23:59:35 UTC, 0 replies.
- Worker Random Port - posted by Paul Magid <Pa...@toyota.com> on 2014/09/24 00:10:02 UTC, 1 replies.
- SQL status code to indicate success or failure of query - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/24 01:55:43 UTC, 1 replies.
- Does anyone have experience with using Hadoop InputFormats? - posted by Steve Lewis <lo...@gmail.com> on 2014/09/24 02:13:13 UTC, 10 replies.
- IOException running streaming job - posted by Emil Gustafsson <em...@cellfish.se> on 2014/09/24 02:18:23 UTC, 1 replies.
- Adding extra classpath - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/24 02:53:22 UTC, 0 replies.
- Garbage Collection - posted by dizzy5112 <da...@gmail.com> on 2014/09/24 03:06:00 UTC, 0 replies.
- java.lang.OutOfMemoryError while running SVD MLLib example - posted by "sbirari@wynyardgroup.com" <sb...@wynyardgroup.com> on 2014/09/24 05:54:06 UTC, 3 replies.
- Converting one RDD to another - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/24 06:33:45 UTC, 2 replies.
- how long does it take executing ./sbt/sbt assembly - posted by christy <76...@qq.com> on 2014/09/24 07:02:07 UTC, 2 replies.
- Re: how long does it take executing ./sbt/sbt assembly - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/09/24 07:20:40 UTC, 0 replies.
- java.lang.stackoverflowerror when running Spark shell - posted by mrshen <sc...@gmail.com> on 2014/09/24 07:25:30 UTC, 0 replies.
- Can not see any spark metrics on ganglia-web - posted by tsingfu <ha...@gmail.com> on 2014/09/24 07:28:28 UTC, 1 replies.
- [Streaming] Non-blocking recommendation in custom receiver documentation and KinesisReceiver's worker.run blocking call - posted by Aniket Bhatnagar <an...@gmail.com> on 2014/09/24 09:29:38 UTC, 0 replies.
- sortByKey trouble - posted by david <da...@free.fr> on 2014/09/24 10:29:34 UTC, 4 replies.
- All-time stream re-processing - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2014/09/24 10:59:53 UTC, 1 replies.
- Spark Streaming - posted by Reddy Raja <ar...@gmail.com> on 2014/09/24 12:35:23 UTC, 1 replies.
- Re: java.lang.NumberFormatException while starting spark-worker - posted by ji...@wipro.com on 2014/09/24 13:11:17 UTC, 6 replies.
- find subgraph in Graphx - posted by uuree <uu...@yahoo.com> on 2014/09/24 13:13:19 UTC, 0 replies.
- Does Spark Driver works with HDFS in HA mode - posted by Petr Novak <os...@gmail.com> on 2014/09/24 13:35:14 UTC, 2 replies.
- How to sort rdd filled with existing data structures? - posted by Tao Xiao <xi...@gmail.com> on 2014/09/24 16:07:22 UTC, 2 replies.
- Spark Cassandra Connector Issue and performance - posted by pouryas <po...@adbrain.com> on 2014/09/24 16:10:43 UTC, 0 replies.
- Optimal Cluster Setup for Spark - posted by pouryas <po...@adbrain.com> on 2014/09/24 16:14:17 UTC, 0 replies.
- parquetFile and wildcards - posted by Marius Soutier <mp...@gmail.com> on 2014/09/24 16:46:38 UTC, 4 replies.
- Specifying Spark Executor Java options using Spark Submit - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/24 16:52:46 UTC, 1 replies.
- RDD of Iterable[String] - posted by Deep Pradhan <pr...@gmail.com> on 2014/09/24 17:21:03 UTC, 2 replies.
- Spark : java.io.NotSerializableException: org.apache.hadoop.hbase.client.Result - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2014/09/24 17:32:31 UTC, 0 replies.
- Re: - posted by Ted Yu <yu...@gmail.com> on 2014/09/24 17:49:55 UTC, 11 replies.
- Executor/Worker stuck at parquet.hadoop.ParquetFileReader.readNextRowGroup and never finishes. - posted by Jianshi Huang <ji...@gmail.com> on 2014/09/24 17:50:07 UTC, 0 replies.
- WindowedDStreams and hierarchies - posted by Pablo Medina <pa...@gmail.com> on 2014/09/24 18:08:50 UTC, 0 replies.
- Re: task getting stuck - posted by Ted Yu <yu...@gmail.com> on 2014/09/24 18:29:59 UTC, 2 replies.
- Spark Hbase - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2014/09/24 18:39:58 UTC, 2 replies.
- Spark with YARN - posted by Raghuveer Chanda <ra...@gmail.com> on 2014/09/24 19:25:48 UTC, 9 replies.
- RDD save as Seq File - posted by Shay Seng <sh...@urbanengines.com> on 2014/09/24 19:41:35 UTC, 1 replies.
- java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE - posted by Victor Tso-Guillen <vt...@paxata.com> on 2014/09/24 20:10:40 UTC, 1 replies.
- spark.sql.autoBroadcastJoinThreshold - posted by sridhar1135 <sr...@yahoo.com> on 2014/09/24 20:13:23 UTC, 0 replies.
- Null values in pyspark Row - posted by jamborta <ja...@gmail.com> on 2014/09/24 20:56:20 UTC, 1 replies.
- persist before or after checkpoint? - posted by Shay Seng <sh...@urbanengines.com> on 2014/09/24 21:26:44 UTC, 0 replies.
- Hive Avro table fail (pyspark) - posted by ilyagluk <il...@gmail.com> on 2014/09/24 21:28:07 UTC, 0 replies.
- Question About Submit Application - posted by danilopds <da...@gmail.com> on 2014/09/24 22:12:13 UTC, 4 replies.
- Re: Spark Streaming Twitter Example Error - posted by danilopds <da...@gmail.com> on 2014/09/24 22:14:29 UTC, 0 replies.
- Re: Memory/Network Intensive Workload - posted by danilopds <da...@gmail.com> on 2014/09/24 22:16:40 UTC, 0 replies.
- Spark Streaming unable to handle production Kafka load - posted by maddenpj <ma...@gmail.com> on 2014/09/24 23:14:52 UTC, 2 replies.
- Re: Logging in Spark through YARN. - posted by Vipul Pandey <vi...@gmail.com> on 2014/09/24 23:17:21 UTC, 0 replies.
- MLUtils.loadLibSVMFile error - posted by Sameer Tilak <ss...@live.com> on 2014/09/25 00:02:43 UTC, 5 replies.
- Bug in JettyUtils? - posted by Jim Donahue <jd...@adobe.com> on 2014/09/25 00:11:32 UTC, 0 replies.
- Processing multiple request in cluster - posted by Subacini B <su...@gmail.com> on 2014/09/25 01:20:18 UTC, 2 replies.
- Has anyone seen java.nio.ByteBuffer.wrap(ByteBuffer.java:392) - posted by Steve Lewis <lo...@gmail.com> on 2014/09/25 01:32:20 UTC, 0 replies.
- experimental solution to nesting RDDs - posted by ldmtwo <ld...@gmail.com> on 2014/09/25 01:33:50 UTC, 0 replies.
- Spark SQL use of alias in where clause - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/25 02:58:34 UTC, 3 replies.
- Re: How to use FlumeInputDStream in spark cluster? - posted by julyfire <he...@gmail.com> on 2014/09/25 04:51:52 UTC, 0 replies.
- YARN ResourceManager and Hadoop NameNode Web UI not visible in port 8088, port 50070 - posted by Raghuveer Chanda <ra...@gmail.com> on 2014/09/25 06:00:44 UTC, 2 replies.
- Using one sql query's result inside another sql query - posted by twinkle sachdeva <tw...@gmail.com> on 2014/09/25 07:18:51 UTC, 3 replies.
- How to increase number of Active Stages - posted by Alexey Romanchuk <al...@gmail.com> on 2014/09/25 08:30:46 UTC, 2 replies.
- Log hdfs blocks sending - posted by Alexey Romanchuk <al...@gmail.com> on 2014/09/25 09:09:47 UTC, 3 replies.
- Re: quick start guide: building a standalone scala program - posted by christy <76...@qq.com> on 2014/09/25 09:22:04 UTC, 2 replies.
- Memory used in Spark-0.9.0-incubating - posted by 王晓雨 <wa...@jd.com> on 2014/09/25 12:21:37 UTC, 4 replies.
- Windowed Operations - posted by Diego <di...@gmail.com> on 2014/09/25 13:45:46 UTC, 0 replies.
- java.io.FileNotFoundException in usercache - posted by Egor Pahomov <pa...@gmail.com> on 2014/09/25 14:18:48 UTC, 0 replies.
- SPARK 1.1.0 on yarn-cluster and external JARs - posted by rzykov <rz...@gmail.com> on 2014/09/25 14:25:49 UTC, 2 replies.
- Update gcc version ,Still snappy error. - posted by buring <qy...@gmail.com> on 2014/09/25 15:43:12 UTC, 0 replies.
- Pregel messages serialized in local machine? - posted by Cheuk Lam <ch...@hotmail.com> on 2014/09/25 15:52:46 UTC, 1 replies.
- Re: Spark Hive max key length is 767 bytes - posted by "Arthur.hk.chan@gmail.com" <ar...@gmail.com> on 2014/09/25 16:14:34 UTC, 1 replies.
- Systematic error when re-starting Spark stream unless I delete all checkpoints - posted by Svend <sv...@gmail.com> on 2014/09/25 16:20:40 UTC, 2 replies.
- how to run spark job on yarn with jni lib? - posted by taqilabon <g9...@gmail.com> on 2014/09/25 17:34:36 UTC, 5 replies.
- Yarn number of containers - posted by jamborta <ja...@gmail.com> on 2014/09/25 17:55:15 UTC, 4 replies.
- VertexRDD partition imbalance - posted by Larry Xiao <xi...@sjtu.edu.cn> on 2014/09/25 18:41:12 UTC, 0 replies.
- Working on LZOP Files - posted by Harsha HN <99...@gmail.com> on 2014/09/25 18:44:58 UTC, 1 replies.
- Optimal Partition Strategy - posted by "Muttineni, Vinay" <vm...@ebay.com> on 2014/09/25 19:37:27 UTC, 1 replies.
- Spark Streaming + Actors - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2014/09/25 20:08:50 UTC, 1 replies.
- Add Meetup - posted by Brian Husted <br...@tetraconcepts.com> on 2014/09/25 20:49:01 UTC, 0 replies.
- Re: K-means faster on Mahout then on Spark - posted by bhusted <br...@gmail.com> on 2014/09/25 21:28:55 UTC, 1 replies.
- SPARK UI - Details post job processing - posted by Harsha HN <99...@gmail.com> on 2014/09/25 21:55:02 UTC, 5 replies.
- "Ungroup" data - posted by Luis Guerra <lu...@gmail.com> on 2014/09/25 22:17:22 UTC, 0 replies.
- Spark streaming - submit new job version - posted by demian <db...@despegar.com> on 2014/09/25 22:45:42 UTC, 0 replies.
- Spark Streaming: No parallelism in writing to database (MySQL) - posted by maddenpj <ma...@gmail.com> on 2014/09/25 22:56:14 UTC, 4 replies.
- Kryo UnsupportedOperationException - posted by Sandy Ryza <sa...@cloudera.com> on 2014/09/25 23:12:41 UTC, 1 replies.
- Shuffle files - posted by SK <sk...@gmail.com> on 2014/09/26 01:20:19 UTC, 1 replies.
- Is it possible to use Parquet with Dremel encoding - posted by matthes <md...@sensenetworks.com> on 2014/09/26 02:05:43 UTC, 7 replies.
- spark-ec2 ERROR: Line magic function `%matplotlib` not found - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/26 02:11:01 UTC, 0 replies.
- Job cancelled because SparkContext was shut down - posted by jamborta <ja...@gmail.com> on 2014/09/26 03:02:21 UTC, 1 replies.
- flume spark streaming receiver host random - posted by centerqi hu <ce...@gmail.com> on 2014/09/26 04:32:12 UTC, 3 replies.
- Re: spark-ec2 ERROR: Line magic function `%matplotlib` not found - posted by Davies Liu <da...@databricks.com> on 2014/09/26 06:58:58 UTC, 0 replies.
- Parallel spark jobs on standalone cluster - posted by Sarath Chandra <sa...@algofusiontech.com> on 2014/09/26 07:03:13 UTC, 0 replies.
- Spark Streaming: foreachRDD network output - posted by Jesper Lundgren <ko...@gmail.com> on 2014/09/26 07:35:04 UTC, 0 replies.
- Access by name in "tuples" in Scala with Spark - posted by rzykov <rz...@gmail.com> on 2014/09/26 09:31:56 UTC, 1 replies.
- Spark SQL question: is cached SchemaRDD storage controlled by "spark.storage.memoryFraction"? - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/09/26 10:04:18 UTC, 2 replies.
- Access file name in map function - posted by Shekhar Bansal <sh...@yahoo.com.INVALID> on 2014/09/26 12:45:10 UTC, 1 replies.
- Re: Issue with Spark-1.1.0 and the start-thriftserver.sh script - posted by Cheng Lian <li...@gmail.com> on 2014/09/26 13:23:06 UTC, 0 replies.
- executorAdded event to DAGScheduler - posted by praveen seluka <pr...@gmail.com> on 2014/09/26 14:02:01 UTC, 3 replies.
- How to run hive scripts programmatically in Spark 1.1.0 ? - posted by Sherine <sh...@gmail.com> on 2014/09/26 14:29:44 UTC, 1 replies.
- java.io.IOException Error in task deserialization - posted by Arun Ahuja <aa...@gmail.com> on 2014/09/26 15:11:30 UTC, 3 replies.
- mappartitions data size - posted by jamborta <ja...@gmail.com> on 2014/09/26 16:19:53 UTC, 1 replies.
- Build error when using spark with breeze - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2014/09/26 17:42:13 UTC, 6 replies.
- How to do operations on multiple RDD's - posted by Johan Stenberg <jo...@gmail.com> on 2014/09/26 17:55:24 UTC, 1 replies.
- Re: spark-ec2 script with Tachyon - posted by mrm <ma...@skimlinks.com> on 2014/09/26 18:28:12 UTC, 0 replies.
- problem with spark-ec2 launch script Re: spark-ec2 ERROR: Line magic function `%matplotlib` not found - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/26 20:18:38 UTC, 2 replies.
- SF Scala: Spark and Machine Learning Videos - posted by Alexy Khrabrov <al...@scalable.pro> on 2014/09/26 21:21:51 UTC, 0 replies.
- Communication between threads within a worker - posted by "lokesh.gidra" <lo...@gmail.com> on 2014/09/26 21:36:35 UTC, 0 replies.
- SparkSQL: map type MatchError when inserting into Hive table - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/27 01:48:34 UTC, 4 replies.
- RDD logic and control - posted by pop1998 <po...@gmail.com> on 2014/09/27 19:09:42 UTC, 0 replies.
- MLlib 1.2 New & Interesting Features - posted by Krishna Sankar <ks...@gmail.com> on 2014/09/27 21:15:30 UTC, 2 replies.
- Re: Retrieve dataset of Big Data Benchmark - posted by Tom <th...@gmail.com> on 2014/09/27 23:49:23 UTC, 0 replies.
- iPython notebook ec2 cluster matlabplot not found? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/28 01:33:39 UTC, 4 replies.
- yarn does not accept job in cluster mode - posted by jamborta <ja...@gmail.com> on 2014/09/28 03:38:09 UTC, 2 replies.
- Re: New user question on Spark SQL: can I really use Spark SQL like a normal DB? - posted by jamborta <ja...@gmail.com> on 2014/09/28 04:01:12 UTC, 0 replies.
- Re: Build spark with Intellij IDEA 13 - posted by maddenpj <ma...@gmail.com> on 2014/09/28 05:01:11 UTC, 1 replies.
- PageRank execution imbalance, might hurt performance by 6x - posted by Larry Xiao <xi...@sjtu.edu.cn> on 2014/09/28 05:25:39 UTC, 0 replies.
- How to use multi thread in RDD map function ? - posted by myasuka <my...@live.com> on 2014/09/28 05:44:02 UTC, 3 replies.
- problem with data locality api - posted by qinwei <we...@dewmobile.net> on 2014/09/28 08:05:05 UTC, 1 replies.
- problem with partitioning - posted by qinwei <we...@dewmobile.net> on 2014/09/28 08:36:19 UTC, 2 replies.
- Reply: RE: problem with data locality api - posted by qinwei <we...@dewmobile.net> on 2014/09/28 08:55:08 UTC, 0 replies.
- spark multi-node cluster - posted by codeoedoc <co...@gmail.com> on 2014/09/28 09:36:00 UTC, 2 replies.
- How to do broadcast join in SparkSQL - posted by Jianshi Huang <ji...@gmail.com> on 2014/09/28 10:55:59 UTC, 2 replies.
- [MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights. - posted by Yanbo Liang <ya...@gmail.com> on 2014/09/28 11:48:39 UTC, 3 replies.
- [SF Machine Learning meetup] talk by Prof. C J Lin, large-scale linear classification: status and challenges - posted by Chester At Work <ch...@alpinenow.com> on 2014/09/28 15:39:30 UTC, 0 replies.
- driver memory management - posted by Brad Miller <bm...@eecs.berkeley.edu> on 2014/09/28 19:51:53 UTC, 1 replies.
- view not supported in spark thrift server? - posted by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/28 20:59:10 UTC, 2 replies.
- Spark meetup on Oct 15 in NYC - posted by Reynold Xin <rx...@databricks.com> on 2014/09/28 21:56:35 UTC, 0 replies.
- Spark SQL question: how to control the storage level of cached SchemaRDD? - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/09/29 03:39:25 UTC, 2 replies.
- Re: Kinesis receiver & spark streaming partition - posted by Wei Liu <we...@stellarloyalty.com> on 2014/09/29 05:24:29 UTC, 0 replies.
- aggregateByKey vs combineByKey - posted by David Rowe <da...@gmail.com> on 2014/09/29 08:59:35 UTC, 2 replies.
- SQL queries fail in 1.2.0-SNAPSHOT - posted by "Wang, Daoyuan" <da...@intel.com> on 2014/09/29 10:17:51 UTC, 4 replies.
- REPL like interface for Spark - posted by IT CTO <go...@gmail.com> on 2014/09/29 10:19:16 UTC, 5 replies.
- Re: Workers disconnected from master sometimes and never reconnect back - posted by Romi Kuntsman <ro...@totango.com> on 2014/09/29 13:36:45 UTC, 1 replies.
- The confusion order of rows in SVD matrix ? - posted by buring <qy...@gmail.com> on 2014/09/29 15:58:52 UTC, 1 replies.
- Unresolved attributes: SparkSQL on the schemaRDD - posted by "vdiwakar.malladi" <vd...@gmail.com> on 2014/09/29 17:08:28 UTC, 7 replies.
- Spark SQL + Hive + JobConf NoClassDefFoundError - posted by Patrick McGloin <mc...@gmail.com> on 2014/09/29 17:41:14 UTC, 0 replies.
- Simple Question: Spark Streaming Applications - posted by Saiph Kappa <sa...@gmail.com> on 2014/09/29 19:15:15 UTC, 2 replies.
- When to start optimizing for GC? - posted by Ashish Jain <as...@gmail.com> on 2014/09/29 19:43:22 UTC, 0 replies.
- Ack RabbitMQ messages after processing through Spark Streaming - posted by khaledh <kh...@gmail.com> on 2014/09/29 19:45:35 UTC, 1 replies.
- Window comparison matching using the sliding window functionality: feasibility - posted by nitinkak001 <ni...@gmail.com> on 2014/09/29 20:24:47 UTC, 2 replies.
- Schema change on Spark Hive (Parquet file format) table not working - posted by "barge.nilesh" <ba...@gmail.com> on 2014/09/29 22:43:37 UTC, 1 replies.
- Using addFile with pipe on a yarn cluster - posted by esamanas <ev...@gmail.com> on 2014/09/29 22:52:51 UTC, 0 replies.
- about partition number - posted by anny9699 <an...@gmail.com> on 2014/09/29 23:01:55 UTC, 3 replies.
- partitions number with variable number of cores - posted by Jonathan Esterhazy <je...@groupon.com> on 2014/09/30 00:25:02 UTC, 0 replies.
- Spark Language Integrated SQL for join on expression - posted by Benyi Wang <be...@gmail.com> on 2014/09/30 00:39:57 UTC, 1 replies.
- in memory assumption in cogroup? - posted by Koert Kuipers <ko...@tresata.com> on 2014/09/30 01:02:48 UTC, 1 replies.
- ExecutorLostFailure kills sparkcontext - posted by jamborta <ja...@gmail.com> on 2014/09/30 01:45:51 UTC, 1 replies.
- newbie system architecture problem, trouble using streaming and RDD.pipe() - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/30 02:56:09 UTC, 0 replies.
- Re: shuffle memory requirements - posted by maddenpj <ma...@gmail.com> on 2014/09/30 03:20:44 UTC, 1 replies.
- Reading from HBase is too slow - posted by Tao Xiao <xi...@gmail.com> on 2014/09/30 04:21:23 UTC, 6 replies.
- Spark SQL question: why build hashtable for both sides in HashOuterJoin? - posted by Haopu Wang <HW...@qilinsoft.com> on 2014/09/30 05:36:55 UTC, 3 replies.
- SparkSQL DataType mappings - posted by Costin Leau <co...@gmail.com> on 2014/09/30 11:05:29 UTC, 0 replies.
- Spark Streaming for time consuming job - posted by Eko Susilo <ek...@gmail.com> on 2014/09/30 11:52:17 UTC, 0 replies.
- Fwd: Actual Probabilities when Using Naive Bayes classifier - posted by Marius FETEANU <ma...@sien.com> on 2014/09/30 11:55:48 UTC, 0 replies.
- Getting errors in spark worker nodes - posted by Murthy Chelankuri <km...@gmail.com> on 2014/09/30 13:39:51 UTC, 0 replies.
- Parallel spark jobs on mesos cluster - posted by Sarath Chandra <sa...@algofusiontech.com> on 2014/09/30 16:37:39 UTC, 0 replies.
- Re: S3 - Extra $_folder$ files for every directory node - posted by pouryas <po...@adbrain.com> on 2014/09/30 16:43:52 UTC, 1 replies.
- registering Array of CompactBuffer to Kryo - posted by Andras Barjak <an...@lynxanalytics.com> on 2014/09/30 17:33:55 UTC, 0 replies.
- Re: Spark SQL and Hive tables - posted by Chen Song <ch...@gmail.com> on 2014/09/30 17:56:11 UTC, 0 replies.
- Installation question - posted by mohan <ra...@gmail.com> on 2014/09/30 18:32:35 UTC, 0 replies.
- timestamp not implemented yet - posted by tonsat <to...@gmail.com> on 2014/09/30 18:45:16 UTC, 2 replies.
- MLLib ALS question - posted by Alex T <ch...@gmail.com> on 2014/09/30 19:44:48 UTC, 1 replies.
- MLLib: Missing value imputation - posted by Sameer Tilak <ss...@live.com> on 2014/09/30 20:26:52 UTC, 1 replies.
- Multiple exceptions in Spark Streaming - posted by Shaikh Riyaz <sh...@gmail.com> on 2014/09/30 20:33:23 UTC, 0 replies.
- pyspark cassandra examples - posted by David Vincelli <da...@vantageanalytics.com> on 2014/09/30 20:37:13 UTC, 2 replies.
- how to get actual count from as long from JavaDStream ? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2014/09/30 21:42:32 UTC, 1 replies.
- processing large number of files - posted by SK <sk...@gmail.com> on 2014/09/30 22:59:07 UTC, 0 replies.
- Handling tree reduction algorithm with Spark in parallel - posted by Boromir Widas <vc...@gmail.com> on 2014/09/30 23:12:38 UTC, 1 replies.