- Re: how to convert a sequence of TimeStamp to a dataframe - posted by Ted Yu <yu...@gmail.com> on 2015/08/01 00:22:55 UTC, 1 replies.
- Re: Has anybody ever tried running Spark Streaming on 500 text streams? - posted by Tathagata Das <td...@databricks.com> on 2015/08/01 00:55:31 UTC, 0 replies.
- RandomForest in Pyspark (version 1.4.1) - posted by SK <sk...@gmail.com> on 2015/08/01 03:36:24 UTC, 0 replies.
- Re: How to create Spark DataFrame using custom Hadoop InputFormat? - posted by Umesh Kacha <um...@gmail.com> on 2015/08/01 06:52:07 UTC, 0 replies.
- Re: flatMap output on disk / flatMap memory overhead - posted by Puneet Kapoor <pu...@gmail.com> on 2015/08/01 11:41:04 UTC, 0 replies.
- Re: Spark SQL DataFrame: Nullable column and filtering - posted by Martin Senne <ma...@googlemail.com> on 2015/08/01 13:13:21 UTC, 0 replies.
- About memory leak in spark 1.4.1 - posted by Sea <26...@qq.com> on 2015/08/01 16:41:57 UTC, 8 replies.
- Re: Does anyone have experience with using Hadoop InputFormats? - posted by "Antsy.Rao" <an...@gmail.com> on 2015/08/01 18:19:00 UTC, 0 replies.
- No event logs in yarn-cluster mode - posted by Akmal Abbasov <ak...@icloud.com> on 2015/08/01 18:25:16 UTC, 2 replies.
- Fwd: How does the # of tasks affect # of threads? - posted by Connor Zanin <cn...@udel.edu> on 2015/08/01 22:47:29 UTC, 4 replies.
- Re: Spark Number of Partitions Recommendations - posted by Ruslan Dautkhanov <da...@gmail.com> on 2015/08/01 23:14:16 UTC, 1 replies.
- TCP/IP speedup - posted by Simon Edelhaus <ed...@gmail.com> on 2015/08/02 00:24:19 UTC, 5 replies.
- Re: unsubscribe - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/08/02 09:56:47 UTC, 2 replies.
- Re: Does Spark Streaming need to list all the files in a directory? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/08/02 10:03:26 UTC, 0 replies.
- Re: Encryption on RDDs or in-memory/cache on Apache Spark - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/08/02 10:23:07 UTC, 1 replies.
- Re: About memory leak in spark 1.4.1 - posted by Sea <26...@qq.com> on 2015/08/02 11:16:47 UTC, 4 replies.
- spark no output - posted by Pa Rö <pa...@googlemail.com> on 2015/08/02 16:04:25 UTC, 4 replies.
- Re: How to increase parallelism of a Spark cluster? - posted by Sujit Pal <su...@gmail.com> on 2015/08/02 19:17:09 UTC, 10 replies.
- how to ignore MatchError then processing a large json file in spark-sql - posted by fuellee lee <li...@gmail.com> on 2015/08/02 20:27:53 UTC, 1 replies.
- [SparkScore]Performance portal for Apache Spark - WW31 - posted by "Huang, Jie" <ji...@intel.com> on 2015/08/03 03:21:32 UTC, 0 replies.
- spark cluster setup - posted by Angel Angel <ar...@gmail.com> on 2015/08/03 04:16:07 UTC, 3 replies.
- Cannot Import Package (spark-csv) - posted by billchambers <wc...@ischool.berkeley.edu> on 2015/08/03 05:33:42 UTC, 5 replies.
- Checkpoint file not found - posted by Anand Nalya <an...@gmail.com> on 2015/08/03 06:14:02 UTC, 2 replies.
- Extremely poor predictive performance with RF in mllib - posted by pkphlam <pk...@gmail.com> on 2015/08/03 07:20:56 UTC, 4 replies.
- RE: SparkLauncher not notified about finished job - hangs infinitely. - posted by Tomasz Guziałek <To...@HumanInference.com> on 2015/08/03 09:49:20 UTC, 0 replies.
- spark --files permission error - posted by Shushant Arora <sh...@gmail.com> on 2015/08/03 10:34:05 UTC, 0 replies.
- Is it possible to disable AM page proxy in Yarn client mode? - posted by Rex Xiong <by...@gmail.com> on 2015/08/03 10:52:32 UTC, 1 replies.
- Running multiple batch jobs in parallel using Spark on Mesos - posted by Akash Mishra <ak...@gmail.com> on 2015/08/03 10:55:10 UTC, 1 replies.
- org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit - posted by Rajeshkumar J <ra...@gmail.com> on 2015/08/03 13:47:39 UTC, 1 replies.
- spark streaming program failed on Spark 1.4.1 - posted by Netwaver <wa...@163.com> on 2015/08/03 15:36:42 UTC, 1 replies.
- How to calculate standard deviation of grouped data in a DataFrame? - posted by the3rdNotch <st...@notch.bz> on 2015/08/03 16:30:35 UTC, 0 replies.
- How do I Process Streams that span multiple lines? - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/08/03 16:57:05 UTC, 3 replies.
- Re: How to control Spark Executors from getting Lost when using YARN client mode? - posted by Umesh Kacha <um...@gmail.com> on 2015/08/03 17:59:45 UTC, 1 replies.
- large scheduler delay in pyspark - posted by gen tang <ge...@gmail.com> on 2015/08/03 18:00:51 UTC, 3 replies.
- Re: HiveQL to SparkSQL - posted by Bigdata techguy <bi...@gmail.com> on 2015/08/03 18:12:04 UTC, 0 replies.
- Does RDD.cartesian involve shuffling? - posted by Meihua Wu <ro...@gmail.com> on 2015/08/03 18:56:31 UTC, 3 replies.
- Standalone Cluster Local Authentication - posted by MrJew <ko...@gmail.com> on 2015/08/03 19:05:41 UTC, 2 replies.
- EOFException when transmitting a class that extends Externalizable - posted by Michael Knapp <mi...@gmail.com> on 2015/08/03 19:10:15 UTC, 0 replies.
- Re: Package Release Annoucement: Spark SQL on HBase "Astro" - posted by Ted Yu <yu...@gmail.com> on 2015/08/03 19:32:54 UTC, 1 replies.
- Combine code for RDD and DStream - posted by Sidd S <ss...@gmail.com> on 2015/08/03 19:42:25 UTC, 1 replies.
- Writing to HDFS - posted by Jasleen Kaur <ja...@gmail.com> on 2015/08/03 21:49:17 UTC, 2 replies.
- Re: Python, Spark and HBase - posted by ericbless <er...@yahoo.com> on 2015/08/03 22:34:24 UTC, 0 replies.
- shutdown local hivecontext? - posted by Cesar Flores <ce...@gmail.com> on 2015/08/04 00:09:22 UTC, 3 replies.
- NullPointException Help while using accumulators - posted by Anubhav Agarwal <an...@gmail.com> on 2015/08/04 00:13:25 UTC, 4 replies.
- Contributors group and starter task - posted by Namit Katariya <ka...@gmail.com> on 2015/08/04 00:50:08 UTC, 2 replies.
- SparkR broadcast variables - posted by Deborah Siegel <de...@gmail.com> on 2015/08/04 01:59:07 UTC, 1 replies.
- How does DataFrame except work? - posted by Srikanth <sr...@gmail.com> on 2015/08/04 02:05:46 UTC, 0 replies.
- Multiple UpdateStateByKey Functions in the same job? - posted by swetha <sw...@gmail.com> on 2015/08/04 03:32:31 UTC, 1 replies.
- Topology.py -- Cannot run on Spark Gateway on Cloudera 5.4.4. - posted by Upen N <uk...@gmail.com> on 2015/08/04 04:10:19 UTC, 3 replies.
- Unable to compete with performance of single-threaded Scala application - posted by Philip Weaver <ph...@gmail.com> on 2015/08/04 04:31:31 UTC, 0 replies.
- Safe to write to parquet at the same time? - posted by Philip Weaver <ph...@gmail.com> on 2015/08/04 04:37:16 UTC, 1 replies.
- Re: Spark-Submit error - posted by Guru Medasani <gd...@gmail.com> on 2015/08/04 05:08:29 UTC, 2 replies.
- spark streaming max receiver rate doubts - posted by Shushant Arora <sh...@gmail.com> on 2015/08/04 06:11:20 UTC, 1 replies.
- Repartition question - posted by Naveen Madhire <vm...@umail.iu.edu> on 2015/08/04 06:31:04 UTC, 1 replies.
- Re: Data from PostgreSQL to Spark - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/08/04 07:14:00 UTC, 0 replies.
- Re: Unable to query existing hive table from spark sql 1.3.0 - posted by Ishwardeep Singh <is...@impetus.co.in> on 2015/08/04 07:58:37 UTC, 0 replies.
- Spark SQL support for Hive 0.14 - posted by Ishwardeep Singh <is...@impetus.co.in> on 2015/08/04 08:01:09 UTC, 4 replies.
- Re: Schema evolution in tables - posted by Brandon White <bw...@gmail.com> on 2015/08/04 08:10:51 UTC, 0 replies.
- NoSuchMethodError : org.apache.spark.streaming.scheduler.StreamingListenerBus.start()V - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/04 09:07:24 UTC, 1 replies.
- Unable to load native-hadoop library for your platform - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/04 09:11:00 UTC, 7 replies.
- Twitter live Streaming - posted by Sadaf <sa...@platalytics.com> on 2015/08/04 10:29:26 UTC, 3 replies.
- Re: Twitter Connector-Spark Streaming - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/08/04 11:00:32 UTC, 1 replies.
- Transform MongoDB Aggregation into Spark Job - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/04 11:18:04 UTC, 1 replies.
- Total delay per batch in a CSV file - posted by allonsy <lu...@gmail.com> on 2015/08/04 12:58:51 UTC, 1 replies.
- Fwd: Writing streaming data to cassandra creates duplicates - posted by Priya Ch <le...@gmail.com> on 2015/08/04 13:03:04 UTC, 2 replies.
- Re: TFIDF Transformation - posted by Yanbo Liang <yb...@gmail.com> on 2015/08/04 13:03:30 UTC, 2 replies.
- Re: Setting a stage timeout - posted by William Kinney <wi...@gmail.com> on 2015/08/04 13:39:51 UTC, 0 replies.
- Re: Parquet SaveMode.Append Trouble. - posted by Cheng Lian <li...@gmail.com> on 2015/08/04 13:48:26 UTC, 0 replies.
- Delete NA in a dataframe - posted by clark djilo kuissu <dj...@yahoo.fr> on 2015/08/04 14:03:45 UTC, 2 replies.
- Re: Schedule lunchtime today for a free webinar "IoT data ingestion in Spark Streaming using Kaa" 11 a.m. PDT (2 p.m. EDT) - posted by or...@cybervisiontech.com on 2015/08/04 14:12:12 UTC, 0 replies.
- giving offset in spark sql - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/08/04 16:04:30 UTC, 0 replies.
- Installation instruction for Zeppelin - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/08/04 17:07:57 UTC, 0 replies.
- No Twitter Input from Kafka to Spark Streaming - posted by narendra <na...@gmail.com> on 2015/08/04 18:45:34 UTC, 3 replies.
- Spark SQL unable to recognize schema name - posted by Mohammed Guller <mo...@glassbeam.com> on 2015/08/04 20:45:19 UTC, 0 replies.
- Re: scheduler delay time - posted by maxdml <ma...@gmail.com> on 2015/08/04 22:14:54 UTC, 0 replies.
- Re: Spark SQL unable to recognize schema name - posted by Ted Yu <yu...@gmail.com> on 2015/08/04 23:22:19 UTC, 0 replies.
- Re: Problem submiting an script .py against an standalone cluster. - posted by Ford Farline <fo...@gmail.com> on 2015/08/04 23:36:01 UTC, 0 replies.
- Poor HDFS Data Locality on Spark-EC2 - posted by Jerry Lam <ch...@gmail.com> on 2015/08/05 00:43:07 UTC, 0 replies.
- Is SPARK-3322 fixed in latest version of Spark? - posted by Jim Green <op...@gmail.com> on 2015/08/05 03:12:00 UTC, 3 replies.
- Turn Off Compression for Textfiles - posted by Brandon White <bw...@gmail.com> on 2015/08/05 04:00:23 UTC, 1 replies.
- Re: Difference between RandomForestModel and RandomForestClassificationModel - posted by Yanbo Liang <yb...@gmail.com> on 2015/08/05 04:12:44 UTC, 0 replies.
- Combining Spark Files with saveAsTextFile - posted by Brandon White <bw...@gmail.com> on 2015/08/05 04:22:41 UTC, 3 replies.
- RE: Combining Spark Files with saveAsTextFile - posted by Mohammed Guller <mo...@glassbeam.com> on 2015/08/05 06:38:51 UTC, 1 replies.
- control the number of reducers for groupby in data frame - posted by "Fang, Mike" <ch...@paypal.com.INVALID> on 2015/08/05 07:47:43 UTC, 2 replies.
- Debugging Spark job in Eclipse - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/05 10:42:05 UTC, 1 replies.
- Implementing algorithms in GraphX pregel - posted by Krish <kr...@gmail.com> on 2015/08/05 10:46:10 UTC, 0 replies.
- Fwd: MLLIB MulticlassMetrics Unable to find class key - posted by Hayri Volkan Agun <vo...@gmail.com> on 2015/08/05 11:55:24 UTC, 0 replies.
- Spark Streaming - CheckPointing issue - posted by Sadaf <sa...@platalytics.com> on 2015/08/05 14:34:36 UTC, 1 replies.
- Label based MLLib MulticlassMetrics is buggy - posted by Hayri Volkan Agun <vo...@gmail.com> on 2015/08/05 15:19:43 UTC, 3 replies.
- Best practices to call hiveContext in DataFrame.foreach in executor program or how to have a for loop in driver program - posted by unk1102 <um...@gmail.com> on 2015/08/05 17:37:23 UTC, 0 replies.
- Spark SQL Hive - merge small files - posted by Brandon White <bw...@gmail.com> on 2015/08/05 17:43:29 UTC, 2 replies.
- Memory allocation error with Spark 1.5 - posted by Alexis Seigneurin <as...@ippon.fr> on 2015/08/05 18:25:37 UTC, 2 replies.
- Streaming and calculated-once semantics - posted by Dimitris Kouzis - Loukas <lo...@gmail.com> on 2015/08/05 19:46:58 UTC, 0 replies.
- Starting Spark SQL thrift server from within a streaming app - posted by Daniel Haviv <da...@veracity-group.com> on 2015/08/05 19:57:11 UTC, 3 replies.
- Spark MLib v/s SparkR - posted by praveen S <my...@gmail.com> on 2015/08/05 20:24:24 UTC, 6 replies.
- spark hangs at broadcasting during a filter - posted by AlexG <sw...@gmail.com> on 2015/08/05 20:54:50 UTC, 2 replies.
- Set Job Descriptions for Scala application - posted by Rares Vernica <rv...@gmail.com> on 2015/08/05 21:29:59 UTC, 1 replies.
- HiveContext error - posted by Stefan Panayotov <sp...@msn.com> on 2015/08/05 21:57:39 UTC, 0 replies.
- SparkConf "ignoring" keys - posted by Corey Nolet <cj...@gmail.com> on 2015/08/05 22:40:29 UTC, 0 replies.
- Windows function examples in pyspark - posted by jegordon <jg...@gmail.com> on 2015/08/06 00:42:47 UTC, 0 replies.
- Re: trying to understand yarn-client mode - posted by nir <ni...@gmail.com> on 2015/08/06 00:58:29 UTC, 0 replies.
- Newbie question: can shuffle avoid writing and reading from disk? - posted by Muler <mu...@gmail.com> on 2015/08/06 01:10:57 UTC, 4 replies.
- Pause Spark Streaming reading or sampling streaming data - posted by foobar <he...@fb.com> on 2015/08/06 01:44:33 UTC, 5 replies.
- Very high latency to initialize a DataFrame from partitioned parquet database. - posted by Philip Weaver <ph...@gmail.com> on 2015/08/06 02:26:40 UTC, 11 replies.
- PySpark in Pycharm- unable to connect to remote server - posted by Ashish Dutt <as...@gmail.com> on 2015/08/06 02:45:45 UTC, 0 replies.
- Reliable Streaming Receiver - posted by Sourabh Chandak <so...@gmail.com> on 2015/08/06 02:48:29 UTC, 4 replies.
- How to connect to remote HDFS programmatically to retrieve data, analyse it and then write the data back to HDFS? - posted by Ashish Dutt <as...@gmail.com> on 2015/08/06 03:04:37 UTC, 1 replies.
- How to read gzip data in Spark - Simple question - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/08/06 05:12:13 UTC, 8 replies.
- Re: Upgrade of Spark-Streaming application - posted by Shushant Arora <sh...@gmail.com> on 2015/08/06 05:35:34 UTC, 2 replies.
- spark on mesos with docker from private repository - posted by Eyal Fink <ey...@yowza3d.com> on 2015/08/06 07:45:26 UTC, 0 replies.
- subscribe - posted by Franc Carter <fr...@rozettatech.com> on 2015/08/06 07:51:53 UTC, 8 replies.
- spark job not accepting resources from worker - posted by Kushal Chokhani <ku...@enlightedinc.com> on 2015/08/06 08:10:31 UTC, 2 replies.
- Unable to persist RDD to HDFS - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/08/06 08:34:22 UTC, 1 replies.
- Talk on Deep dive in Spark Dataframe API - posted by madhu phatak <ph...@gmail.com> on 2015/08/06 08:51:53 UTC, 0 replies.
- Aggregate by timestamp from json message - posted by vchandra <vc...@gmail.com> on 2015/08/06 09:16:29 UTC, 0 replies.
- Multiple Thrift servers on one Spark cluster - posted by Bojan Kostic <bl...@gmail.com> on 2015/08/06 10:35:00 UTC, 1 replies.
- Enum values in custom objects mess up RDD operations - posted by Warfish <se...@gmail.com> on 2015/08/06 10:41:19 UTC, 2 replies.
- Re: Is there any way to support multiple users executing SQL on thrift server? - posted by Ted Yu <yu...@gmail.com> on 2015/08/06 11:20:47 UTC, 0 replies.
- How can I know currently supported functions in Spark SQL - posted by Netwaver <wa...@163.com> on 2015/08/06 11:52:26 UTC, 4 replies.
- Re:Re: Real-time data visualization with Zeppelin - posted by jun <ki...@126.com> on 2015/08/06 12:08:48 UTC, 1 replies.
- how to stop twitter-spark streaming - posted by Sadaf <sa...@platalytics.com> on 2015/08/06 12:57:51 UTC, 0 replies.
- Spark-submit not finding main class and the error reflects different path to jar file than specified - posted by Stephen Boesch <ja...@gmail.com> on 2015/08/06 14:18:21 UTC, 1 replies.
- Is it worth storing in ORC for one time read. And can be replace hive with HBase - posted by venkatesh b <ve...@gmail.com> on 2015/08/06 14:54:22 UTC, 3 replies.
- SparkR -Graphx - posted by smagadi <su...@fico.com> on 2015/08/06 15:21:12 UTC, 1 replies.
- SparkException: Yarn application has already ended - posted by Clint McNeil <cl...@impactradius.com> on 2015/08/06 15:29:22 UTC, 1 replies.
- Temp file missing when training logistic regression - posted by Cat <ca...@dsp.io> on 2015/08/06 17:05:42 UTC, 1 replies.
- How to binarize data in spark - posted by Adamantios Corais <ad...@gmail.com> on 2015/08/06 17:19:14 UTC, 3 replies.
- Terminate streaming app on cluster restart - posted by Alexander Krasheninnikov <a....@corp.badoo.com> on 2015/08/06 17:45:11 UTC, 0 replies.
- Error while using ConcurrentHashMap in Spark Streaming - posted by UMESH CHAUDHARY <um...@gmail.com> on 2015/08/06 17:54:11 UTC, 1 replies.
- RE: Execption while using kryo with broadcast - posted by Shuai Zheng <sz...@gmail.com> on 2015/08/06 17:54:26 UTC, 0 replies.
- Specifying the role when launching an AWS spark cluster using spark_ec2 - posted by SK <sk...@gmail.com> on 2015/08/06 19:27:11 UTC, 2 replies.
- Removing empty partitions before we write to HDFS - posted by gpatcham <gp...@gmail.com> on 2015/08/06 21:02:46 UTC, 2 replies.
- Spark Kinesis Checkpointing/Processing Delay - posted by phibit <ph...@gmail.com> on 2015/08/06 21:08:12 UTC, 3 replies.
- Re: log4j.xml bundled in jar vs log4.properties in spark/conf - posted by mlemay <ml...@gmail.com> on 2015/08/06 22:05:28 UTC, 1 replies.
- log4j custom appender ClassNotFoundException with spark 1.4.1 - posted by mlemay <ml...@gmail.com> on 2015/08/06 22:12:21 UTC, 4 replies.
- Spark-Grid Engine light integration writeup - posted by David Chin <da...@gmail.com> on 2015/08/06 22:28:33 UTC, 0 replies.
- Spark Job Failed (Executor Lost & then FS closed) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/08/06 23:50:00 UTC, 2 replies.
- All masters are unresponsive! Giving up. - posted by Jeff Jones <jj...@adaptivebiotech.com> on 2015/08/07 01:12:52 UTC, 4 replies.
- Spark-submit fails when jar is in HDFS - posted by Alan Braithwaite <al...@cloudflare.com> on 2015/08/07 02:21:47 UTC, 4 replies.
- stopping spark stream app - posted by Shushant Arora <sh...@gmail.com> on 2015/08/07 05:26:20 UTC, 10 replies.
- Out of memory with twitter spark streaming - posted by Pankaj Narang <pa...@gmail.com> on 2015/08/07 07:53:00 UTC, 1 replies.
- StringIndexer + VectorAssembler equivalent to HashingTF? - posted by praveen S <my...@gmail.com> on 2015/08/07 08:55:57 UTC, 1 replies.
- Re: SparkR Supported Types - Please add "bigint" - posted by Davies Liu <da...@databricks.com> on 2015/08/07 09:28:22 UTC, 0 replies.
- JavsSparkContext causes hadoop.ipc.RemoteException error - posted by junliu6 <ju...@iflytek.com> on 2015/08/07 10:01:35 UTC, 0 replies.
- Why use spark.history.fs.logDirectory instead of spark.eventLog.dir - posted by canan chen <cc...@gmail.com> on 2015/08/07 10:20:42 UTC, 4 replies.
- DataFrame column structure change - posted by Rishabh Bhardwaj <rb...@gmail.com> on 2015/08/07 10:36:06 UTC, 3 replies.
- miniBatchFraction for LinearRegressionWithSGD - posted by Gerald Loeffler <ge...@googlemail.com> on 2015/08/07 10:45:40 UTC, 6 replies.
- Spark on YARN - posted by Jem Tucker <je...@gmail.com> on 2015/08/07 10:48:45 UTC, 6 replies.
- SparkR -Graphx Connected components - posted by smagadi <su...@fico.com> on 2015/08/07 11:36:19 UTC, 3 replies.
- How to distribute non-serializable object in transform task or broadcast ? - posted by Hao Ren <in...@gmail.com> on 2015/08/07 11:39:26 UTC, 5 replies.
- automatically determine cluster number - posted by Ziqi Zhang <zi...@sheffield.ac.uk> on 2015/08/07 12:02:48 UTC, 0 replies.
- Spark streaming and session windows - posted by Ankur Chauhan <an...@malloc64.com> on 2015/08/07 12:27:20 UTC, 0 replies.
- Insert operation in Dataframe - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/08/07 13:14:59 UTC, 0 replies.
- Re: Time series forecasting - posted by ploffay <pl...@redhat.com> on 2015/08/07 14:24:30 UTC, 0 replies.
- Issues with Phoenix 4.5 - posted by Nicola Ferraro <ni...@gmail.com> on 2015/08/07 14:27:30 UTC, 0 replies.
- Possible bug: JDBC with Speculative mode launches orphan queries - posted by Sa...@wellsfargo.com on 2015/08/07 15:00:21 UTC, 0 replies.
- Amazon DynamoDB & Spark - posted by Yasemin Kaya <go...@gmail.com> on 2015/08/07 15:08:03 UTC, 2 replies.
- Estimate size of Dataframe programatically - posted by Srikanth <sr...@gmail.com> on 2015/08/07 15:48:15 UTC, 3 replies.
- Issue when rebroadcasting a variable outside of the definition scope - posted by "simone.robutti" <si...@gmail.com> on 2015/08/07 16:07:17 UTC, 1 replies.
- distributing large matrices - posted by iceback <ro...@utah.edu> on 2015/08/07 17:17:18 UTC, 2 replies.
- Spark job workflow engine recommendations - posted by Vikram Kone <vi...@gmail.com> on 2015/08/07 17:43:48 UTC, 14 replies.
- Newbie question: what makes Spark run faster than MapReduce - posted by Muler <mu...@gmail.com> on 2015/08/07 18:13:13 UTC, 2 replies.
- Spark is in-memory processing, how then can Tachyon make Spark faster? - posted by Muler <mu...@gmail.com> on 2015/08/07 18:42:16 UTC, 2 replies.
- How to run start-thrift-server in debug mode? - posted by Benjamin Ross <br...@Lattice-Engines.com> on 2015/08/07 18:50:27 UTC, 1 replies.
- tachyon - posted by "Abhishek R. Singh" <ab...@tetrationanalytics.com> on 2015/08/07 18:56:38 UTC, 3 replies.
- SparkSQL: remove jar added by "add jar " command from dependencies - posted by "Wu, James C." <Ja...@disney.com> on 2015/08/07 19:29:25 UTC, 0 replies.
- [Spark Streaming] Session based windowing like in google dataflow - posted by Ankur Chauhan <an...@malloc64.com> on 2015/08/07 19:48:40 UTC, 1 replies.
- Fwd: [Spark + Hive + EMR + S3] Issue when reading from Hive external table backed on S3 with large amount of small files - posted by Roberto Coluccio <ro...@gmail.com> on 2015/08/07 20:18:33 UTC, 0 replies.
- Spark SQL query AVRO file - posted by java8964 <ja...@hotmail.com> on 2015/08/07 20:30:01 UTC, 4 replies.
- Get bucket details created in shuffle phase - posted by cheez <11...@seecs.edu.pk> on 2015/08/07 20:32:16 UTC, 0 replies.
- Spark master driver UI: How to keep it after process finished? - posted by Sa...@wellsfargo.com on 2015/08/07 21:26:44 UTC, 5 replies.
- SparkSQL: "add jar" blocks all queries - posted by "Wu, James C." <Ja...@disney.com> on 2015/08/07 21:40:39 UTC, 1 replies.
- Problems getting expected results from hbase_inputformat.py - posted by Eric Bless <er...@yahoo.com.INVALID> on 2015/08/07 23:03:38 UTC, 3 replies.
- Fwd: spark config - posted by Bryce Lobdell <lo...@gmail.com> on 2015/08/07 23:08:32 UTC, 3 replies.
- Accessing S3 files with s3n:// - posted by Akshat Aranya <aa...@gmail.com> on 2015/08/07 23:42:18 UTC, 5 replies.
- How to get total CPU consumption for Spark job - posted by Xiao JIANG <ji...@outlook.com> on 2015/08/08 00:06:02 UTC, 1 replies.
- Spark failed while trying to read parquet files - posted by Jerrick Hoang <je...@gmail.com> on 2015/08/08 00:20:44 UTC, 3 replies.
- does dstream.transform() run on the driver node? - posted by lookfwd <lo...@gmail.com> on 2015/08/08 01:07:43 UTC, 0 replies.
- Checkpoint Dir Error in Yarn - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/08 02:48:42 UTC, 1 replies.
- using Spark or pig group by efficient in my use case? - posted by linlma <li...@gmail.com> on 2015/08/08 04:55:47 UTC, 2 replies.
- Spark Maven Build - posted by Benyi Wang <be...@gmail.com> on 2015/08/08 05:45:17 UTC, 1 replies.
- java.lang.ClassNotFoundException - posted by Yasemin Kaya <go...@gmail.com> on 2015/08/08 09:00:51 UTC, 2 replies.
- Pagination on big table, splitting joins - posted by Gaspar Muñoz <gm...@stratio.com> on 2015/08/08 12:58:52 UTC, 1 replies.
- Spark sql jobs n their partition - posted by Raghavendra Pandey <ra...@gmail.com> on 2015/08/08 22:34:16 UTC, 0 replies.
- How to create DataFrame from a binary file? - posted by unk1102 <um...@gmail.com> on 2015/08/08 22:42:15 UTC, 5 replies.
- Re: Schema change on Spark Hive (Parquet file format) table not working - posted by sim <si...@swoop.com> on 2015/08/09 05:47:47 UTC, 0 replies.
- Re: Spark inserting into parquet files with different schema - posted by sim <si...@swoop.com> on 2015/08/09 05:58:25 UTC, 4 replies.
- deleting application files in standalone cluster - posted by Lior Chaga <li...@taboola.com> on 2015/08/09 09:38:31 UTC, 0 replies.
- ERROR ReceiverTracker: Deregistered receiver for stream 0: Stopped by driver - posted by Sadaf <sa...@platalytics.com> on 2015/08/09 09:52:31 UTC, 2 replies.
- Starting a service with Spark Executors - posted by Daniel Haviv <da...@veracity-group.com> on 2015/08/09 11:29:55 UTC, 1 replies.
- Merge metadata error when appending to parquet table - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/08/09 14:19:02 UTC, 2 replies.
- Questions about SparkSQL join on not equality conditions - posted by gen tang <ge...@gmail.com> on 2015/08/09 15:08:29 UTC, 2 replies.
- stream application map transformation constructor called - posted by Shushant Arora <sh...@gmail.com> on 2015/08/09 18:10:44 UTC, 0 replies.
- intellij14 compiling spark-1.3.1 got error: assertion failed: com.google.protobuf.InvalidProtocalBufferException - posted by "longdanky@163.com" <lo...@163.com> on 2015/08/09 18:41:08 UTC, 3 replies.
- multiple dependency jars using pyspark - posted by Jonathan Haddad <jo...@jonhaddad.com> on 2015/08/09 22:19:09 UTC, 2 replies.
- Error when running pyspark/shell.py to set up iPython notebook - posted by YaoPau <jo...@gmail.com> on 2015/08/10 06:38:31 UTC, 0 replies.
- mllib kmeans produce 1 large and many extremely small clusters - posted by farhan <fa...@hotmail.com> on 2015/08/10 06:58:33 UTC, 1 replies.
- SparkR -Graphx Cliques - posted by smagadi <su...@fico.com> on 2015/08/10 08:13:00 UTC, 0 replies.
- Possible issue for Spark SQL/DataFrame - posted by Netwaver <wa...@163.com> on 2015/08/10 08:36:06 UTC, 3 replies.
- Spark Streaming Restart at scheduled intervals - posted by Pankaj Narang <pa...@gmail.com> on 2015/08/10 10:56:58 UTC, 1 replies.
- question about spark streaming - posted by sequoiadb <ma...@sequoiadb.com> on 2015/08/10 12:24:00 UTC, 1 replies.
- Kinesis records are merged with out obvious way of separating them - posted by raam <ra...@apester.com> on 2015/08/10 12:31:14 UTC, 0 replies.
- Differents of loading data - posted by 李铖 <li...@gmail.com> on 2015/08/10 13:01:00 UTC, 1 replies.
- Spark with GCS Connector - Rate limit error - posted by Oren Shpigel <or...@yowza3d.com> on 2015/08/10 13:09:52 UTC, 1 replies.
- How to connect to spark remotely from java - posted by Zsombor Egyed <eg...@starschema.net> on 2015/08/10 13:44:05 UTC, 2 replies.
- EC2 cluster doesn't work saveAsTextFile - posted by Yasemin Kaya <go...@gmail.com> on 2015/08/10 14:08:36 UTC, 3 replies.
- How to programmatically create, submit and report on Spark jobs? - posted by mark <ma...@googlemail.com> on 2015/08/10 14:12:11 UTC, 2 replies.
- Spark Cassandra Connector issue - posted by satish chandra j <js...@gmail.com> on 2015/08/10 14:44:40 UTC, 5 replies.
- spark-kafka directAPI vs receivers based API - posted by Mohit Durgapal <du...@gmail.com> on 2015/08/10 14:51:06 UTC, 1 replies.
- spark vs flink low memory available - posted by Pa Rö <pa...@googlemail.com> on 2015/08/10 15:59:04 UTC, 5 replies.
- Spark Streaming dealing with broken files without dying - posted by Mario Pastorelli <ma...@teralytics.ch> on 2015/08/10 16:14:52 UTC, 1 replies.
- ClosureCleaner does not work for java code - posted by Hao Ren <in...@gmail.com> on 2015/08/10 17:32:25 UTC, 1 replies.
- How to fix OutOfMemoryError: GC overhead limit exceeded when using Spark Streaming checkpointing - posted by Dmitry Goldenberg <dg...@gmail.com> on 2015/08/10 17:57:31 UTC, 10 replies.
- Problem with take vs. takeSample in PySpark - posted by David Montague <da...@gmail.com> on 2015/08/10 18:49:34 UTC, 1 replies.
- Streaming of WordCount example - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/10 19:29:14 UTC, 8 replies.
- Kafka direct approach: blockInterval and topic partitions - posted by allonsy <lu...@gmail.com> on 2015/08/10 19:52:03 UTC, 2 replies.
- How to use custom Hadoop InputFormat in DataFrame? - posted by unk1102 <um...@gmail.com> on 2015/08/10 20:22:24 UTC, 2 replies.
- Re: Graceful shutdown for Spark Streaming - posted by Michal Čizmazia <mi...@gmail.com> on 2015/08/10 21:12:58 UTC, 1 replies.
- Fw: Your Application has been Received - posted by Shing Hing Man <ma...@yahoo.com.INVALID> on 2015/08/10 21:20:53 UTC, 0 replies.
- Java Streaming Context - File Stream use - posted by Ashish Soni <as...@gmail.com> on 2015/08/10 21:40:37 UTC, 1 replies.
- Is there any external dependencies for lag() and lead() when using data frames? - posted by Jerry <je...@gmail.com> on 2015/08/10 22:26:23 UTC, 5 replies.
- Optimal way to implement a small lookup table for identifiers in an RDD - posted by Mike Trienis <mi...@orcsol.com> on 2015/08/10 23:13:36 UTC, 0 replies.
- When will window .... - posted by Martin Senne <ma...@martin-senne.de> on 2015/08/10 23:15:19 UTC, 0 replies.
- avoid duplicate due to executor failure in spark stream - posted by Shushant Arora <sh...@gmail.com> on 2015/08/10 23:32:13 UTC, 3 replies.
- collect() works, take() returns "ImportError: No module named iter" - posted by YaoPau <jo...@gmail.com> on 2015/08/10 23:53:51 UTC, 4 replies.
- Re: Do I really need to build Spark for Hive/Thrift Server support? - posted by roni <ro...@gmail.com> on 2015/08/11 00:33:42 UTC, 0 replies.
- Random Forest and StringIndexer in pyspark ML Pipeline - posted by pkphlam <pk...@gmail.com> on 2015/08/11 00:56:06 UTC, 1 replies.
- Re: can't start master node on a standalone environment - posted by pradyumnad <pr...@gmail.com> on 2015/08/11 01:13:00 UTC, 0 replies.
- Re: Json parsing library for Spark Streaming? - posted by pradyumnad <pr...@gmail.com> on 2015/08/11 01:38:43 UTC, 0 replies.
- Inquery about contributing codes - posted by Hyukjin Kwon <gu...@gmail.com> on 2015/08/11 05:02:37 UTC, 1 replies.
- Differents in loading data using spark datasource api and using jdbc - posted by 李铖 <li...@gmail.com> on 2015/08/11 05:23:27 UTC, 1 replies.
- Writing a DataFrame as compressed JSON - posted by sim <si...@swoop.com> on 2015/08/11 06:12:13 UTC, 0 replies.
- Error while output JavaDStream to disk and mongodb - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/11 08:14:24 UTC, 0 replies.
- Refresh table - posted by Jerrick Hoang <je...@gmail.com> on 2015/08/11 08:14:54 UTC, 1 replies.
- Re: Wish for 1.4: upper bound on # tasks in Mesos - posted by Haripriya Ayyalasomayajula <ah...@gmail.com> on 2015/08/11 08:26:36 UTC, 1 replies.
- Re: Controlling number of executors on Mesos vs YARN - posted by Haripriya Ayyalasomayajula <ah...@gmail.com> on 2015/08/11 08:38:13 UTC, 7 replies.
- Re: 答复: Package Release Annoucement: Spark SQL on HBase "Astro" - posted by Ted Yu <yu...@gmail.com> on 2015/08/11 09:27:42 UTC, 0 replies.
- Fwd: How to minimize shuffling on Spark dataframe Join? - posted by Abdullah Anwar <ab...@gmail.com> on 2015/08/11 10:44:50 UTC, 3 replies.
- Python3 Spark execution problems - posted by Javier Domingo Cansino <ja...@fon.com> on 2015/08/11 11:02:45 UTC, 1 replies.
- AW: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by re...@nzz.ch on 2015/08/11 11:39:46 UTC, 0 replies.
- dse spark-submit multiple jars issue - posted by satish chandra j <js...@gmail.com> on 2015/08/11 12:29:17 UTC, 6 replies.
- How to specify column type when saving DataFrame as parquet file? - posted by Jyun-Fan Tsai <jf...@appier.com> on 2015/08/11 12:58:17 UTC, 2 replies.
- Do you have any other method to get cpu elapsed time of an spark application - posted by JoneZhang <jo...@gmail.com> on 2015/08/11 14:34:55 UTC, 0 replies.
- mllib on (key, Iterable[Vector]) - posted by Fabian Böhnlein <fa...@gmail.com> on 2015/08/11 14:43:29 UTC, 1 replies.
- PySpark order-only window function issue - posted by Maciej Szymkiewicz <ms...@gmail.com> on 2015/08/11 15:41:20 UTC, 1 replies.
- Spark runs into an Infinite loop even if the tasks are completed successfully - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/08/11 15:59:07 UTC, 6 replies.
- Parquet without hadoop: Possible? - posted by Sa...@wellsfargo.com on 2015/08/11 16:28:57 UTC, 6 replies.
- Unsupported major.minor version 51.0 - posted by "Yakubovich, Alexey" <Al...@searshc.com> on 2015/08/11 16:55:01 UTC, 3 replies.
- Spark DataFrames uses too many partition - posted by Al M <al...@gmail.com> on 2015/08/11 17:31:05 UTC, 5 replies.
- Application failed error - posted by Anubhav Agarwal <an...@gmail.com> on 2015/08/11 19:07:06 UTC, 0 replies.
- Does print/event logging affect performance? - posted by Sa...@wellsfargo.com on 2015/08/11 21:24:59 UTC, 1 replies.
- Spark Job Hangs on our production cluster - posted by java8964 <ja...@hotmail.com> on 2015/08/11 22:19:05 UTC, 8 replies.
- ClassNotFound spark streaming - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/11 22:52:52 UTC, 3 replies.
- Job is Failing automatically - posted by Nikhil Gs <gs...@gmail.com> on 2015/08/11 23:07:46 UTC, 3 replies.
- Boosting spark.yarn.executor.memoryOverhead - posted by Eric Bless <er...@yahoo.com.INVALID> on 2015/08/11 23:40:38 UTC, 1 replies.
- Sporadic "Input validation failed" error when executing LogisticRegressionWithLBFGS.train - posted by Francis Lau <fr...@smartsheet.com> on 2015/08/11 23:56:42 UTC, 1 replies.
- grouping by a partitioned key - posted by Philip Weaver <ph...@gmail.com> on 2015/08/12 00:19:13 UTC, 5 replies.
- adding a custom Scala RDD for use in PySpark - posted by Eric Walker <er...@node.io> on 2015/08/12 00:20:33 UTC, 0 replies.
- Re: Re: Re: Package Release Annoucement: Spark SQL on HBase "Astro" - posted by Ted Yu <yu...@gmail.com> on 2015/08/12 01:01:43 UTC, 0 replies.
- Partitioning in spark streaming - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/12 01:06:33 UTC, 5 replies.
- Spark 1.4.0 Docker Slave GPU Access - posted by "Nastooh Avessta (navesta)" <na...@cisco.com> on 2015/08/12 01:36:08 UTC, 1 replies.
- What is the optimal approach to do Secondary Sort in Spark? - posted by swetha <sw...@gmail.com> on 2015/08/12 02:36:55 UTC, 1 replies.
- pregel graphx job not finishing - posted by dizzy5112 <da...@gmail.com> on 2015/08/12 04:23:27 UTC, 0 replies.
- Exception in spark - posted by Ravisankar Mani <rr...@gmail.com> on 2015/08/12 05:50:38 UTC, 4 replies.
- Error when running SparkPi in Intellij - posted by canan chen <cc...@gmail.com> on 2015/08/12 05:56:40 UTC, 0 replies.
- Not seeing Log messages - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/08/12 06:53:53 UTC, 1 replies.
- How to Handle Update Operation from Spark to MongoDB - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/12 10:21:22 UTC, 0 replies.
- Parquet file organisation for 100GB+ dataframes - posted by Ewan Leith <ew...@realitymine.com> on 2015/08/12 12:28:12 UTC, 0 replies.
- Is there any tool that i can prove to customer that spark is faster then hive ? - posted by Ladle <la...@tcs.com> on 2015/08/12 12:28:41 UTC, 2 replies.
- make-distribution.sh failing at spark/R/lib/sparkr.zip - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2015/08/12 12:31:59 UTC, 2 replies.
- What is the Effect of Serialization within Stages? - posted by Mark Heimann <ma...@kard.info> on 2015/08/12 12:32:46 UTC, 3 replies.
- Spark 1.4.1 py4j.Py4JException: Method read([]) does not exist - posted by resonance <ma...@live.com> on 2015/08/12 14:45:48 UTC, 0 replies.
- Does Spark optimization might miss to run transformation? - posted by Eugene Morozov <fa...@list.ru> on 2015/08/12 16:06:42 UTC, 1 replies.
- Spark 1.2.2 build problem with Hive 0.12, bringing in wrong version of avro-mapred - posted by java8964 <ja...@hotmail.com> on 2015/08/12 16:36:35 UTC, 0 replies.
- Error writing to cassandra table using spark application - posted by "Nupur Kumar (BLOOMBERG/ 731 LEX)" <nk...@bloomberg.net> on 2015/08/12 18:23:19 UTC, 1 replies.
- UnknownHostNameException looking up host name with > 64 characters - posted by Jeff Jones <jj...@adaptivebiotech.com> on 2015/08/12 20:45:08 UTC, 0 replies.
- spark's behavior about failed tasks - posted by freedafeng <fr...@yahoo.com> on 2015/08/12 22:22:19 UTC, 0 replies.
- Spark - Standalone Vs YARN Vs Mesos - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/08/12 23:16:30 UTC, 3 replies.
- Spark 1.3 + Parquet: "Skipping data using statistics" - posted by YaoPau <jo...@gmail.com> on 2015/08/13 00:11:55 UTC, 1 replies.
- what is cause of, and how to recover from, unresponsive nodes w/ spark-ec2 script - posted by AlexG <sw...@gmail.com> on 2015/08/13 01:28:26 UTC, 0 replies.
- Unit Testing - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/13 01:31:51 UTC, 2 replies.
- Re: Sorted Multiple Outputs - posted by Eugene Morozov <fa...@list.ru> on 2015/08/13 02:06:17 UTC, 1 replies.
- Spark Streaming failing on YARN Cluster - posted by Ramkumar V <ra...@gmail.com> on 2015/08/13 08:50:51 UTC, 6 replies.
- serialization issue - posted by 周千昊 <qh...@apache.org> on 2015/08/13 11:57:25 UTC, 1 replies.
- Spark Streaming: Change Kafka topics on runtime - posted by Nisrina Luthfiyati <ni...@gmail.com> on 2015/08/13 12:38:19 UTC, 3 replies.
- About Databricks's spark-sql-perf - posted by Todd <bi...@163.com> on 2015/08/13 15:49:07 UTC, 3 replies.
- Re: Spark DataFrames uses too many partition - posted by prosp4300 <pr...@163.com> on 2015/08/13 15:54:55 UTC, 0 replies.
- Re: Question regarding join with multiple columns with pyspark - posted by Dan LaBar <da...@gmail.com> on 2015/08/13 16:10:16 UTC, 0 replies.
- Eviction of RDD persisted on disk - posted by Eugene Morozov <fa...@list.ru> on 2015/08/13 16:15:29 UTC, 0 replies.
- Re: saveToCassandra not working in Spark Job but works in Spark Shell - posted by satish chandra j <js...@gmail.com> on 2015/08/13 16:29:25 UTC, 2 replies.
- Create column in nested structure? - posted by Ewan Leith <ew...@realitymine.com> on 2015/08/13 16:44:10 UTC, 1 replies.
- spark.streaming.maxRatePerPartition parameter: what are the benefits? - posted by allonsy <lu...@gmail.com> on 2015/08/13 16:50:29 UTC, 1 replies.
- Streaming on Exponential Data - posted by UMESH CHAUDHARY <um...@gmail.com> on 2015/08/13 17:55:30 UTC, 1 replies.
- Spark 1.3.0: ExecutorLostFailure depending on input file size - posted by "Wyss Michael (wysm)" <wy...@zhaw.ch> on 2015/08/13 19:33:03 UTC, 0 replies.
- Spark RuntimeException hadoop output format - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/13 19:49:28 UTC, 6 replies.
- Write to cassandra...each individual statement - posted by Priya Ch <le...@gmail.com> on 2015/08/13 20:30:17 UTC, 3 replies.
- Fwd: - Spark 1.4.1 - run-example SparkPi - Failure ... - posted by Naga Vij <nv...@gmail.com> on 2015/08/13 20:46:05 UTC, 0 replies.
- MatrixFactorizationModel.save got StackOverflowError - posted by Benyi Wang <be...@gmail.com> on 2015/08/13 21:02:00 UTC, 0 replies.
- RDD.join vs spark SQL join - posted by Xiao JIANG <ji...@outlook.com> on 2015/08/13 21:55:07 UTC, 2 replies.
- New Spark User - GBM iterations and Spark benchmarks - posted by "Sereday, Scott" <Sc...@nielsen.com> on 2015/08/13 22:24:17 UTC, 0 replies.
- Input size increasing every iteration of gradient boosted trees [1.4] - posted by Matt Forbes <mf...@twitter.com.INVALID> on 2015/08/13 23:04:44 UTC, 2 replies.
- Retrieving offsets from previous spark streaming checkpoint - posted by Stephen Durfey <sj...@gmail.com> on 2015/08/13 23:53:02 UTC, 1 replies.
- Custom serialization and checkpointing - posted by Tech Meme <ev...@gmail.com> on 2015/08/14 01:32:44 UTC, 0 replies.
- graphx class not found error - posted by dizzy5112 <da...@gmail.com> on 2015/08/14 03:17:03 UTC, 2 replies.
- Re: ERROR Executor java.lang.NoClassDefFoundError - posted by nsalian <ne...@gmail.com> on 2015/08/14 04:08:55 UTC, 0 replies.
- Reduce number of partitions before saving to file. coalesce or repartition? - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/08/14 04:56:45 UTC, 1 replies.
- Always two tasks slower than others, and then job fails - posted by randylu <ra...@gmail.com> on 2015/08/14 05:01:29 UTC, 2 replies.
- spark streaming map use external variable occur a problem - posted by kale <80...@qq.com> on 2015/08/14 05:02:42 UTC, 1 replies.
- Materials for deep insight into Spark SQL - posted by Todd <bi...@163.com> on 2015/08/14 05:54:23 UTC, 2 replies.
- matrix inverse and multiplication - posted by go canal <go...@yahoo.com.INVALID> on 2015/08/14 05:59:37 UTC, 1 replies.
- worker and executor memory - posted by James Pirz <ja...@gmail.com> on 2015/08/14 06:10:45 UTC, 1 replies.
- how do you convert directstream into data frames - posted by Mohit Durgapal <du...@gmail.com> on 2015/08/14 06:41:35 UTC, 1 replies.
- Driver staggering task launch times - posted by Ara Vartanian <ar...@cs.wisc.edu> on 2015/08/14 07:13:04 UTC, 3 replies.
- Using unserializable classes in tasks - posted by mark <ma...@googlemail.com> on 2015/08/14 09:05:45 UTC, 3 replies.
- Re: spark.files.userClassPathFirst=true Return Error - Please help - posted by Kyle Lin <ky...@gmail.com> on 2015/08/14 10:35:21 UTC, 2 replies.
- Spark job endup with NPE - posted by hide <x2...@gmail.com> on 2015/08/14 13:05:22 UTC, 1 replies.
- Error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration - posted by stelsavva <st...@avocarrot.com> on 2015/08/14 13:59:14 UTC, 2 replies.
- Left outer joining big data set with small lookups - posted by VIJAYAKUMAR JAWAHARLAL <sp...@data2o.io> on 2015/08/14 15:39:32 UTC, 5 replies.
- Cannot cast to Tuple when running in cluster mode - posted by Sa...@wellsfargo.com on 2015/08/14 17:34:48 UTC, 1 replies.
- Re: Maintaining Kafka Direct API Offsets - posted by dutrow <da...@gmail.com> on 2015/08/14 17:36:42 UTC, 5 replies.
- Another issue with using lag and lead with data frames - posted by Jerry <je...@gmail.com> on 2015/08/14 18:50:59 UTC, 3 replies.
- Fwd: Graphx - how to add vertices to a HashSet of vertices ? - posted by Ranjana Rajendran <ra...@gmail.com> on 2015/08/14 20:04:46 UTC, 0 replies.
- Re: Checkpointing doesn't appear to be working for direct streaming from Kafka - posted by Dmitry Goldenberg <dg...@gmail.com> on 2015/08/14 21:31:25 UTC, 2 replies.
- QueueStream Does Not Support Checkpointing - posted by Asim Jalis <as...@gmail.com> on 2015/08/14 22:04:30 UTC, 4 replies.
- Help with persist: Data is requested again - posted by Sa...@wellsfargo.com on 2015/08/14 23:11:57 UTC, 1 replies.
- Re: Setting up Spark/flume/? to Ingest 10TB from FTP - posted by "Varadhan, Jawahar" <va...@yahoo.com.INVALID> on 2015/08/14 23:11:58 UTC, 2 replies.
- Executors on multiple nodes - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/15 01:40:51 UTC, 1 replies.
- Too many files/dirs in hdfs - posted by Mohit Anchlia <mo...@gmail.com> on 2015/08/15 01:50:50 UTC, 7 replies.
- How to save a string to a text file ? - posted by go canal <go...@yahoo.com.INVALID> on 2015/08/15 04:50:19 UTC, 2 replies.
- How to run spark in standalone mode on cassandra with high availability? - posted by Vikram Kone <vi...@gmail.com> on 2015/08/15 09:33:22 UTC, 1 replies.
- Can't find directory after resetting REPL state - posted by Kevin Jung <it...@samsung.com> on 2015/08/15 10:03:42 UTC, 2 replies.
- Difference between Sort based and Hash based shuffle - posted by Muhammad Haseeb Javed <11...@seecs.edu.pk> on 2015/08/15 22:42:33 UTC, 5 replies.
- Can't understand the size of raw RDD and its DataFrame - posted by Todd <bi...@163.com> on 2015/08/16 03:35:08 UTC, 3 replies.
- spark on yarn is slower than spark-ec2 standalone, how to tune? - posted by AlexG <sw...@gmail.com> on 2015/08/16 03:43:40 UTC, 0 replies.
- TestSQLContext compilation error when run SparkPi in Intellij ? - posted by canan chen <cc...@gmail.com> on 2015/08/16 04:40:47 UTC, 3 replies.
- Spark hangs on collect (stuck on scheduler delay) - posted by Sagi r <st...@gmail.com> on 2015/08/16 09:44:44 UTC, 1 replies.
- Apache Spark - Parallel Processing of messages from Kafka - Java - posted by mohanaugust <mo...@gmail.com> on 2015/08/16 11:49:09 UTC, 2 replies.
- Spark can't fetch application jar after adding it to HTTP server - posted by t4ng0 <ma...@gmail.com> on 2015/08/16 14:47:35 UTC, 1 replies.
- Re: How to submit an application using spark-submit - posted by t4ng0 <ma...@gmail.com> on 2015/08/16 18:11:28 UTC, 0 replies.
- Spark cant fetch the added jar to http server - posted by t4ng0 <ma...@gmail.com> on 2015/08/16 19:11:15 UTC, 0 replies.
- Spark on scala 2.11 build fails due to incorrect jline dependency in REPL - posted by Stephen Boesch <ja...@gmail.com> on 2015/08/16 20:12:23 UTC, 2 replies.
- SparkPi is geting java.lang.NoClassDefFoundError: scala/collection/Seq - posted by xiaohe lan <zo...@gmail.com> on 2015/08/16 20:14:36 UTC, 2 replies.
- Spark executor lost because of time out even after setting quite long time out value 1000 seconds - posted by unk1102 <um...@gmail.com> on 2015/08/16 20:15:19 UTC, 1 replies.
- Example code to spawn multiple threads in driver program - posted by unk1102 <um...@gmail.com> on 2015/08/16 20:56:25 UTC, 0 replies.
- Re: Spark Master HA on YARN - posted by Ruslan Dautkhanov <da...@gmail.com> on 2015/08/16 22:59:48 UTC, 1 replies.
- How should I do to solve this problem that the executors of my spark application always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/17 05:35:46 UTC, 0 replies.
- Understanding the two jobs run with spark sql join - posted by Todd <bi...@163.com> on 2015/08/17 06:09:14 UTC, 0 replies.
- Re: How should I do to solve this problem that the executors of my spark application always is blocked after an executor is lost? - posted by Saisai Shao <sa...@gmail.com> on 2015/08/17 06:47:44 UTC, 0 replies.
- grpah x issue spark 1.3 - posted by dizzy5112 <da...@gmail.com> on 2015/08/17 07:19:05 UTC, 2 replies.
- Re: How should I do to solve this problem that the executors of myspark application always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/17 07:27:59 UTC, 0 replies.
- Re: How should I do to solve this problem that the executors of myspark application always is blocked after an executor is lost? - posted by Saisai Shao <sa...@gmail.com> on 2015/08/17 07:47:07 UTC, 0 replies.
- Subscribe - posted by Rishitesh Mishra <ri...@gmail.com> on 2015/08/17 08:23:39 UTC, 0 replies.
- Programmatically create SparkContext on YARN - posted by Andreas Fritzler <an...@gmail.com> on 2015/08/17 09:34:17 UTC, 2 replies.
- Re: How should I do to solve this problem that the executors ofmyspark application always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/17 10:34:38 UTC, 0 replies.
- Re: How should I do to solve this problem that the executors ofmyspark application always is blocked after an executor is lost? - posted by Saisai Shao <sa...@gmail.com> on 2015/08/17 10:51:30 UTC, 1 replies.
- S3n, parallelism, partitions - posted by matd <ma...@gmail.com> on 2015/08/17 10:59:39 UTC, 2 replies.
- Meaning of local[2] - posted by praveen S <my...@gmail.com> on 2015/08/17 12:34:57 UTC, 2 replies.
- Transform KafkaRDD to KafkaRDD, not plain RDD, or how to keep OffsetRanges after transformation - posted by Petr Novak <os...@gmail.com> on 2015/08/17 13:08:34 UTC, 5 replies.
- Re: spark streaming 1.3 doubts(force it to not consume anything) - posted by Shushant Arora <sh...@gmail.com> on 2015/08/17 13:13:35 UTC, 10 replies.
- Paper on Spark SQL - posted by Todd <bi...@163.com> on 2015/08/17 14:31:35 UTC, 4 replies.
- Serializing MLlib MatrixFactorizationModel - posted by Madawa Soysa <ma...@cse.mrt.ac.lk> on 2015/08/17 15:31:05 UTC, 2 replies.
- Re: Spark Interview Questions - posted by Sandeep Giri <sa...@knowbigdata.com> on 2015/08/17 15:40:38 UTC, 1 replies.
- rdd count is throwing null pointer exception - posted by Priya Ch <le...@gmail.com> on 2015/08/17 16:26:23 UTC, 3 replies.
- [survey] [spark-ec2] What do you like/dislike about spark-ec2? - posted by Nicholas Chammas <ni...@gmail.com> on 2015/08/17 17:09:58 UTC, 4 replies.
- What's the logic in RangePartitioner.rangeBounds method of Apache Spark - posted by ihainan <ih...@gmail.com> on 2015/08/17 18:13:27 UTC, 0 replies.
- Calling hiveContext.sql("insert into table xyz...") in multiple threads? - posted by unk1102 <um...@gmail.com> on 2015/08/17 18:39:37 UTC, 0 replies.
- Embarassingly parallel computation in SparkR? - posted by Kristina Rogale Plazonic <kp...@gmail.com> on 2015/08/17 21:47:54 UTC, 0 replies.
- Re: issue Running Spark Job on Yarn Cluster - posted by poolis <gm...@gmail.com> on 2015/08/17 21:50:33 UTC, 2 replies.
- registering an empty RDD as a temp table in a PySpark SQL context - posted by Eric Walker <er...@node.io> on 2015/08/17 21:53:22 UTC, 1 replies.
- Python's ReduceByKeyAndWindow DStream Keeps Growing - posted by Asim Jalis <as...@gmail.com> on 2015/08/17 23:47:54 UTC, 0 replies.
- how do I execute a job on a single worker node in standalone mode - posted by Axel Dahl <ax...@whisperstream.com> on 2015/08/18 00:36:13 UTC, 5 replies.
- Spark 1.4.1 - Mac OSX Yosemite - posted by Alun Champion <al...@achampion.net> on 2015/08/18 02:36:38 UTC, 3 replies.
- java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString - posted by stark_summer <st...@qq.com> on 2015/08/18 04:51:30 UTC, 2 replies.
- Re: How should I do to solve this problem that the executorsofmyspark application always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/18 04:54:00 UTC, 0 replies.
- Re: How should I do to solve this problem that the executorsofmyspark application always is blocked after an executor is lost? - posted by Saisai Shao <sa...@gmail.com> on 2015/08/18 05:13:22 UTC, 0 replies.
- Exception when S3 path contains colons - posted by Brian Stempin <br...@gmail.com> on 2015/08/18 06:50:32 UTC, 3 replies.
- Regarding rdd.collect() - posted by praveen S <my...@gmail.com> on 2015/08/18 08:32:01 UTC, 6 replies.
- Difference btw MEMORY_ONLY and MEMORY_AND_DISK - posted by Harsha HN <99...@gmail.com> on 2015/08/18 09:15:56 UTC, 1 replies.
- Re: How should I do to solve this problem that the executorsofmysparkapplication always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/18 09:18:06 UTC, 0 replies.
- Changed Column order in DataFrame.Columns call and insertIntoJDBC - posted by MooseSpark <Pa...@gmail.com> on 2015/08/18 09:40:54 UTC, 1 replies.
- Re: Running Spark on user-provided Hadoop installation - posted by gauravsehgal <ga...@gmail.com> on 2015/08/18 10:34:45 UTC, 0 replies.
- Spark SQL Partition discovery - schema evolution - posted by Guy Hadash <GU...@il.ibm.com> on 2015/08/18 10:39:52 UTC, 0 replies.
- global variable in spark streaming with no dependency on key - posted by Joanne Contact <jo...@gmail.com> on 2015/08/18 10:57:15 UTC, 2 replies.
- Re: How should I do to solve this problem that the executorsofmysparkapplication always is blocked after an executor is lost? - posted by Saisai Shao <sa...@gmail.com> on 2015/08/18 11:23:02 UTC, 0 replies.
- Why standalone mode don't allow to set num-executor ? - posted by canan chen <cc...@gmail.com> on 2015/08/18 11:35:04 UTC, 1 replies.
- Why there are overlapping for tasks on the EventTimeline UI - posted by Todd <bi...@163.com> on 2015/08/18 11:40:07 UTC, 1 replies.
- Re: How should I do to solve this problem that theexecutorsofmysparkapplication always is blocked after an executor is lost? - posted by 刚 <94...@qq.com> on 2015/08/18 12:22:28 UTC, 0 replies.
- Is it this a BUG?: Why Spark Flume Streaming job is not deploying the Receiver to the specified host? - posted by diplomatic Guru <di...@gmail.com> on 2015/08/18 13:45:31 UTC, 3 replies.
- Re: how to write any data (non RDD) to a file inside closure? - posted by Robineast <Ro...@xense.co.uk> on 2015/08/18 14:41:36 UTC, 0 replies.
- Spark + Jupyter (IPython Notebook) - posted by Jerry Lam <ch...@gmail.com> on 2015/08/18 15:35:31 UTC, 6 replies.
- Java 8 lambdas - posted by Kristoffer Sjögren <st...@gmail.com> on 2015/08/18 16:23:32 UTC, 1 replies.
- Strange shuffle behaviour difference between Zeppelin and Spark-shell - posted by Rick Moritz <ra...@gmail.com> on 2015/08/18 16:38:26 UTC, 6 replies.
- Spark works with the data in another cluster(Elasticsearch) - posted by gen tang <ge...@gmail.com> on 2015/08/18 16:39:44 UTC, 3 replies.
- Issues with S3 paths that contain colons - posted by bstempi <br...@gmail.com> on 2015/08/18 17:20:22 UTC, 2 replies.
- Spark and ActorSystem - posted by maxdml <ma...@gmail.com> on 2015/08/18 17:25:01 UTC, 0 replies.
- Spark executor lost because of GC overhead limit exceeded even though using 20 executors using 25GB each - posted by unk1102 <um...@gmail.com> on 2015/08/18 17:57:25 UTC, 2 replies.
- What am I missing that's preventing javac from finding the libraries (CLASSPATH is setup...)? - posted by Jerry <je...@gmail.com> on 2015/08/18 19:28:39 UTC, 3 replies.
- COMPUTE STATS on hive table - NoSuchTableException - posted by VIJAYAKUMAR JAWAHARLAL <sp...@data2o.io> on 2015/08/18 20:19:54 UTC, 0 replies.
- Evaluating spark + Cassandra for our use cases - posted by Benjamin Ross <br...@Lattice-Engines.com> on 2015/08/18 21:18:18 UTC, 2 replies.
- Scala: How to match a java object???? - posted by Sa...@wellsfargo.com on 2015/08/18 21:38:32 UTC, 9 replies.
- Json Serde used by Spark Sql - posted by Udit Mehta <um...@groupon.com> on 2015/08/18 22:12:18 UTC, 1 replies.
- NaN in GraphX PageRank answer - posted by Khaled Ammar <kh...@gmail.com> on 2015/08/18 22:41:55 UTC, 0 replies.
- Spark scala addFile retrieving file with incorrect size - posted by Bernardo Vecchia Stein <be...@gmail.com> on 2015/08/18 23:07:13 UTC, 0 replies.
- broadcast variable of Kafka producer throws ConcurrentModificationException - posted by "Shenghua(Daniel) Wan" <wa...@gmail.com> on 2015/08/18 23:55:59 UTC, 7 replies.
- What is the reason for ExecutorLostFailure? - posted by VIJAYAKUMAR JAWAHARLAL <sp...@data2o.io> on 2015/08/19 00:26:19 UTC, 2 replies.
- [mllib] Random forest maxBins and confidence in training points - posted by Mark Alen <li...@yahoo.com.INVALID> on 2015/08/19 01:54:52 UTC, 0 replies.
- to retrive full stack trace - posted by satish chandra j <js...@gmail.com> on 2015/08/19 04:01:52 UTC, 1 replies.
- Failed to fetch block error - posted by swetha <sw...@gmail.com> on 2015/08/19 07:06:04 UTC, 1 replies.
- SparkR csv without headers - posted by Franc Carter <fr...@rozettatech.com> on 2015/08/19 07:48:00 UTC, 3 replies.
- Repartitioning external table in Spark sql - posted by James Pirz <ja...@gmail.com> on 2015/08/19 07:50:18 UTC, 0 replies.
- SaveAsTable changes the order of rows - posted by Kevin Jung <it...@samsung.com> on 2015/08/19 08:03:15 UTC, 0 replies.
- How to automatically relaunch a Driver program after crashes? - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/08/19 08:55:39 UTC, 4 replies.
- Does spark sql support column indexing - posted by Todd <bi...@163.com> on 2015/08/19 09:21:04 UTC, 1 replies.
- Re: Does spark sql support column indexing - posted by prosp4300 <pr...@163.com> on 2015/08/19 09:46:23 UTC, 0 replies.
- What's the best practice for developing new features for spark ? - posted by canan chen <cc...@gmail.com> on 2015/08/19 10:44:09 UTC, 3 replies.
- tasks of stage if run same woker - posted by kale <80...@qq.com> on 2015/08/19 11:00:30 UTC, 0 replies.
- Spark UI returning error 500 in yarn-client mode - posted by Moshe Eshel <me...@gmail.com> on 2015/08/19 11:51:45 UTC, 1 replies.
- Spark return key value pair - posted by Jerry OELoo <oy...@gmail.com> on 2015/08/19 13:10:07 UTC, 2 replies.
- How to overwrite partition when writing Parquet? - posted by Romi Kuntsman <ro...@totango.com> on 2015/08/19 16:48:46 UTC, 3 replies.
- Spark Streaming: Some issues (Could not compute split, block —— not found) and questions - posted by jlg <jg...@adzerk.com> on 2015/08/19 16:51:18 UTC, 1 replies.
- ValueError: Can only zip with RDD which has the same number of partitions error on one machine but not on another - posted by Abhinav Mishra <am...@tidemark.com> on 2015/08/19 16:52:44 UTC, 0 replies.
- blogs/articles/videos on how to analyse spark performance - posted by Todd <bi...@163.com> on 2015/08/19 17:12:26 UTC, 2 replies.
- SQLContext Create Table Problem - posted by Yusuf Can Gürkan <yu...@useinsider.com> on 2015/08/19 17:44:56 UTC, 4 replies.
- SQLContext load. Filtering files - posted by Masf <ma...@gmail.com> on 2015/08/19 19:16:17 UTC, 3 replies.
- How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? - posted by unk1102 <um...@gmail.com> on 2015/08/19 20:46:49 UTC, 10 replies.
- spark.sql.shuffle.partitions=1 seems to be working fine but creates timeout for large skewed data - posted by unk1102 <um...@gmail.com> on 2015/08/19 21:13:34 UTC, 3 replies.
- PySpark on Mesos - Scaling - posted by scox <sc...@renci.org> on 2015/08/19 21:39:14 UTC, 0 replies.
- How to set the number of executors and tasks in a Spark Streaming job in Mesos - posted by swetha <sw...@gmail.com> on 2015/08/19 22:58:06 UTC, 1 replies.
- Spark Sql behaves strangely with tables with a lot of partitions - posted by Jerrick Hoang <je...@gmail.com> on 2015/08/20 04:51:50 UTC, 17 replies.
- creating data warehouse with Spark and running query with Hive - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/08/20 07:53:30 UTC, 1 replies.
- How to get the radius of clusters in spark K means - posted by ashensw <as...@wso2.com> on 2015/08/20 08:14:48 UTC, 1 replies.
- Re: "insert overwrite table phonesall" in spark-sql resulted in java.io.StreamCorruptedException - posted by John Jay <zj...@gmail.com> on 2015/08/20 09:16:04 UTC, 0 replies.
- persist for DStream - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/20 10:39:13 UTC, 1 replies.
- MLlib Prefixspan implementation - posted by Alexis Gillain <al...@googlemail.com> on 2015/08/20 11:00:26 UTC, 4 replies.
- Re: Out of memory exception in MLlib's naive baye's classification training - posted by minerva <st...@gmail.com> on 2015/08/20 11:25:48 UTC, 0 replies.
- Memory-efficient successive calls to repartition() - posted by abellet <au...@telecom-paristech.fr> on 2015/08/20 11:26:58 UTC, 2 replies.
- SparkSQL concerning materials - posted by Dawid Wysakowicz <wy...@gmail.com> on 2015/08/20 11:46:57 UTC, 6 replies.
- Transformation not happening for reduceByKey or GroupByKey - posted by satish chandra j <js...@gmail.com> on 2015/08/20 12:05:52 UTC, 12 replies.
- [SparkR] How to perform a for loop on a DataFrame object - posted by Florian M <fl...@gmail.com> on 2015/08/20 12:10:11 UTC, 0 replies.
- Convert mllib.linalg.Matrix to Breeze - posted by Naveen <na...@formcept.com> on 2015/08/20 12:24:49 UTC, 4 replies.
- Spark 1.3. Insert into hive parquet partitioned table from DataFrame - posted by Masf <ma...@gmail.com> on 2015/08/20 12:25:14 UTC, 0 replies.
- Data locality with HDFS not being seen - posted by Sunil <sd...@gmail.com> on 2015/08/20 13:09:16 UTC, 1 replies.
- How to add a new column with date duration from 2 date columns in a dataframe - posted by Dhaval Patel <dh...@gmail.com> on 2015/08/20 14:18:34 UTC, 5 replies.
- DataFrameWriter.jdbc is very slow - posted by Aram Mkrtchyan <ar...@gmail.com> on 2015/08/20 14:18:45 UTC, 2 replies.
- DAG related query - posted by Bahubali Jain <ba...@gmail.com> on 2015/08/20 15:15:10 UTC, 2 replies.
- PySpark concurrent jobs using single SparkContext - posted by Mike Sukmanowsky <mi...@gmail.com> on 2015/08/20 15:34:31 UTC, 1 replies.
- Data frame created from hive table and its partition - posted by VIJAYAKUMAR JAWAHARLAL <sp...@data2o.io> on 2015/08/20 16:29:36 UTC, 2 replies.
- How to list all dataframes and RDDs available in current session? - posted by Dhaval Patel <dh...@gmail.com> on 2015/08/20 18:49:18 UTC, 4 replies.
- Windowed stream operations -- These are too lazy for some use cases - posted by Justin Grimes <jg...@adzerk.com> on 2015/08/20 18:58:29 UTC, 2 replies.
- Run scala code with spark submit - posted by MasterSergius <ma...@gmail.com> on 2015/08/20 19:07:04 UTC, 1 replies.
- dataframe json schema scan - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/08/20 21:35:35 UTC, 0 replies.
- FAILED_TO_UNCOMPRESS error from Snappy - posted by Kohki Nishio <ta...@gmail.com> on 2015/08/20 21:49:09 UTC, 1 replies.
- load NULL Values in RDD - posted by "SAHA, DEBOBROTA" <ds...@att.com> on 2015/08/20 21:55:15 UTC, 1 replies.
- Creating Spark DataFrame from large pandas DataFrame - posted by Charlie Hack <ch...@gmail.com> on 2015/08/20 22:08:16 UTC, 2 replies.
- org.apache.hadoop.security.AccessControlException: Permission denied when access S3 - posted by Shuai Zheng <sz...@gmail.com> on 2015/08/20 22:33:05 UTC, 1 replies.
- Re: Saving and loading MLlib models as standalone (no Hadoop) - posted by Robineast <Ro...@xense.co.uk> on 2015/08/20 23:34:42 UTC, 0 replies.
- SparkR - can't create spark context - JVM not ready - posted by Deborah Siegel <de...@gmail.com> on 2015/08/21 00:30:39 UTC, 2 replies.
- Spark SQL window functions (RowsBetween) - posted by Mike Trienis <mi...@orcsol.com> on 2015/08/21 00:32:12 UTC, 0 replies.
- Kafka Spark Partition Mapping - posted by nehalsyed <ne...@cable.comcast.com> on 2015/08/21 00:47:52 UTC, 3 replies.
- Any suggestion about "sendMessageReliably failed because ack was not received within 120 sec" - posted by java8964 <ja...@hotmail.com> on 2015/08/21 02:49:52 UTC, 1 replies.
- Re: what determine the task size? - posted by ambujhbti <ag...@hawk.iit.edu> on 2015/08/21 04:01:45 UTC, 1 replies.
- spark kafka partitioning - posted by Gaurav Agarwal <ga...@gmail.com> on 2015/08/21 04:48:56 UTC, 3 replies.
- Spark-Cassandra-connector - posted by Samya <sa...@amadeus.com> on 2015/08/21 07:57:02 UTC, 1 replies.
- ClassCastException when saving a DataFrame to parquet file (saveAsParquetFile, Spark 1.3.1) using Scala - posted by Emma Boya Peng <u3...@connect.hku.hk> on 2015/08/21 09:15:47 UTC, 1 replies.
- SPARK sql :Need JSON back isntead of roq - posted by smagadi <su...@fico.com> on 2015/08/21 09:59:43 UTC, 3 replies.
- SPARK SQL support for XML - posted by smagadi <su...@fico.com> on 2015/08/21 10:01:05 UTC, 1 replies.
- spark streaming 1.3 kafka error - posted by Shushant Arora <sh...@gmail.com> on 2015/08/21 11:06:08 UTC, 9 replies.
- Worker Machine running out of disk for Long running Streaming process - posted by gaurav sharma <sh...@gmail.com> on 2015/08/21 11:59:47 UTC, 2 replies.
- Spark streaming multi-tasking during I/O - posted by Sateesh Kavuri <sa...@gmail.com> on 2015/08/21 12:36:40 UTC, 7 replies.
- DataFrame. SparkPlan / Project serialization issue: ArrayIndexOutOfBounds. - posted by Eugene Morozov <ev...@gmail.com> on 2015/08/21 12:37:28 UTC, 1 replies.
- Is long running Spark batch job in fine grained mode is Deprecated? - posted by Akash Mishra <ak...@gmail.com> on 2015/08/21 13:22:07 UTC, 0 replies.
- Having Clause with variation and stddev - posted by Ravisankar Mani <rr...@gmail.com> on 2015/08/21 14:08:40 UTC, 0 replies.
- Tungsten and sun.misc.Unsafe - posted by Marek Kolodziej <mk...@gmail.com> on 2015/08/21 14:29:13 UTC, 2 replies.
- Want to install lz4 compression - posted by Sa...@wellsfargo.com on 2015/08/21 15:57:56 UTC, 1 replies.
- build spark 1.4.1 with JDK 1.6 - posted by Chen Song <ch...@gmail.com> on 2015/08/21 16:11:59 UTC, 9 replies.
- Finding the number of executors. - posted by Virgil Palanciuc <vi...@palanciuc.eu> on 2015/08/21 16:42:45 UTC, 4 replies.
- Re: Failed stages and dropped executors when running implicit matrix factorization/ALS - posted by Ravi Mody <rm...@gmail.com> on 2015/08/21 16:44:18 UTC, 0 replies.
- Spark ec2 launch problem - posted by Garry Chen <gc...@cornell.edu> on 2015/08/21 16:55:59 UTC, 6 replies.
- RE: Remoting warning when submitting to cluster - posted by javidelgadillo <jd...@esri.com> on 2015/08/21 19:17:29 UTC, 0 replies.
- Datediff in minutes - posted by Stefan Panayotov <sp...@msn.com> on 2015/08/21 19:29:04 UTC, 0 replies.
- Re: Aggregate to array (or 'slice by key') with DataFrames - posted by Impact <na...@skone.org> on 2015/08/21 19:45:23 UTC, 4 replies.
- Set custom worker id ? - posted by Sa...@wellsfargo.com on 2015/08/21 20:57:23 UTC, 1 replies.
- How frequently should full gc we expect - posted by java8964 <ja...@hotmail.com> on 2015/08/21 23:14:07 UTC, 1 replies.
- Streaming: BatchTime OffsetRange Mapping? - posted by Susan Zhang <su...@gmail.com> on 2015/08/22 01:01:57 UTC, 0 replies.
- How can I save the RDD result as Orcfile with spark1.3? - posted by "dong.yajun" <do...@gmail.com> on 2015/08/22 04:36:22 UTC, 4 replies.
- sparkStreaming how to work with partitions,how to create partition - posted by Gaurav Agarwal <ga...@gmail.com> on 2015/08/22 12:39:42 UTC, 1 replies.
- spark 1.4.1 - LZFException - posted by Yadid Ayzenberg <ya...@media.mit.edu> on 2015/08/22 21:57:32 UTC, 1 replies.
- pickling error with PySpark and Elasticsearch-py analyzer - posted by pkphlam <pk...@gmail.com> on 2015/08/23 02:47:12 UTC, 0 replies.
- Re: Using spark streaming to load data from Kafka to HDFS - posted by "Xu (Simon) Chen" <xc...@gmail.com> on 2015/08/23 02:50:38 UTC, 0 replies.
- how to migrate from spark 0.9 to spark 1.4 - posted by sai rakesh <sa...@gmail.com> on 2015/08/23 07:20:15 UTC, 1 replies.
- Error when saving a dataframe as ORC file - posted by lostrain A <do...@gmail.com> on 2015/08/23 10:01:59 UTC, 7 replies.
- How to set environment of worker applications - posted by Jan Algermissen <al...@icloud.com> on 2015/08/23 12:56:59 UTC, 7 replies.
- Re: Spark Mesos Dispatcher - posted by bcajes <br...@gmail.com> on 2015/08/23 16:22:05 UTC, 1 replies.
- How to parse multiple event types using Kafka - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/08/23 17:56:12 UTC, 1 replies.
- Spark YARN executors are not launching when using +UseG1GC - posted by unk1102 <um...@gmail.com> on 2015/08/23 20:49:19 UTC, 0 replies.
- is there a 'knack' to docker and mesos? - posted by Dick Davies <di...@hellooperator.net> on 2015/08/23 21:20:06 UTC, 0 replies.
- B2i Healthcare "Powered by Spark" addition - posted by Brandon Ulrich <bu...@b2i.sg> on 2015/08/23 21:47:55 UTC, 1 replies.
- DataFrame rollup with alias? - posted by Isabelle Phan <nl...@gmail.com> on 2015/08/24 06:13:58 UTC, 0 replies.
- Re: Spark GraphX - posted by Robineast <Ro...@xense.co.uk> on 2015/08/24 07:07:37 UTC, 0 replies.
- How to remove worker node but let it finish first? - posted by Romi Kuntsman <ro...@totango.com> on 2015/08/24 08:41:32 UTC, 2 replies.
- [Spark Streaming on Mesos (good practices)] - posted by Aram Mkrtchyan <ar...@gmail.com> on 2015/08/24 10:15:24 UTC, 1 replies.
- Drop table and Hive warehouse - posted by Kevin Jung <it...@samsung.com> on 2015/08/24 10:32:07 UTC, 2 replies.
- Joining using multimap or array - posted by Ilya Karpov <i....@cleverdata.ru> on 2015/08/24 11:21:40 UTC, 3 replies.
- DataFrame#show cost 2 Spark Jobs ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/08/24 12:19:48 UTC, 8 replies.
- Determinant of Matrix - posted by Naveen <na...@formcept.com> on 2015/08/24 13:10:45 UTC, 1 replies.
- Loading already existing tables in spark shell - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/08/24 14:17:11 UTC, 4 replies.
- Performance - Python streaming v/s Scala streaming - posted by "utk.pat" <ut...@gmail.com> on 2015/08/24 14:22:27 UTC, 2 replies.
- Re: Memory allocation error with Spark 1.5, HashJoinCompatibilitySuite - posted by Adam Roberts <ar...@uk.ibm.com> on 2015/08/24 14:35:59 UTC, 0 replies.
- Difficulties developing a Specs2 matcher for Spark Streaming - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/08/24 14:47:12 UTC, 0 replies.
- How to evaluate custom UDF over window - posted by xander92 <al...@ompnt.com> on 2015/08/24 15:26:43 UTC, 1 replies.
- DataFrame/JDBC very slow performance - posted by Dhaval Patel <dh...@gmail.com> on 2015/08/24 17:17:33 UTC, 2 replies.
- Got wrong md5sum for boto - posted by Justin Pihony <ju...@gmail.com> on 2015/08/24 17:54:33 UTC, 2 replies.
- Unable to catch SparkContext methods exceptions - posted by Roberto Coluccio <ro...@gmail.com> on 2015/08/24 18:09:11 UTC, 3 replies.
- Spark Direct Streaming With ZK Updates - posted by suchenzang <su...@gmail.com> on 2015/08/24 19:09:34 UTC, 5 replies.
- Local Spark talking to remote HDFS? - posted by Dino Fancellu <di...@felstar.com> on 2015/08/24 20:46:05 UTC, 6 replies.
- History server is not receiving any event - posted by "b.bhavesh" <b....@gmail.com> on 2015/08/24 21:37:07 UTC, 1 replies.
- Array Out OF Bound Exception - posted by "SAHA, DEBOBROTA" <ds...@att.com> on 2015/08/24 21:41:31 UTC, 3 replies.
- spark and scala-2.11 - posted by Lanny Ripple <la...@spotright.com> on 2015/08/24 21:48:21 UTC, 3 replies.
- Run Spark job from within iPython+Spark? - posted by YaoPau <jo...@gmail.com> on 2015/08/24 22:06:50 UTC, 0 replies.
- ExternalSorter: Thread *** spilling in-memory map of 352.6 MB to disk (38 times so far) - posted by "dan@lumity.com" <da...@lumity.com> on 2015/08/24 23:27:51 UTC, 0 replies.
- `show tables like 'tmp*';` does not work in Spark 1.3.0+ - posted by dugdun <du...@hotmail.com> on 2015/08/24 23:44:02 UTC, 0 replies.
- Exclude slf4j-log4j12 from the classpath via spark-submit - posted by Utkarsh Sengar <ut...@gmail.com> on 2015/08/24 23:50:33 UTC, 12 replies.
- Strange ClassNotFoundException in spark-shell - posted by Jan Algermissen <al...@icloud.com> on 2015/08/25 00:00:09 UTC, 0 replies.
- Running spark shell on mesos with zookeeper on spark 1.3.1 - posted by kohlisimranjit <ko...@gmail.com> on 2015/08/25 00:11:04 UTC, 0 replies.
- Where is Redgate's HDFS explorer? - posted by Dino Fancellu <di...@felstar.com> on 2015/08/25 00:13:53 UTC, 5 replies.
- Protobuf error when streaming from Kafka - posted by Cassa L <lc...@gmail.com> on 2015/08/25 01:58:13 UTC, 6 replies.
- What does Attribute and AttributeReference mean in Spark SQL - posted by Todd <bi...@163.com> on 2015/08/25 03:13:51 UTC, 2 replies.
- Spark - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/08/25 06:52:03 UTC, 2 replies.
- Test case for the spark sql catalyst - posted by Todd <bi...@163.com> on 2015/08/25 07:01:20 UTC, 2 replies.
- org.apache.spark.shuffle.FetchFailedException - posted by kundan kumar <ii...@gmail.com> on 2015/08/25 07:36:56 UTC, 2 replies.
- How to access Spark UI through AWS - posted by Justin Pihony <ju...@gmail.com> on 2015/08/25 07:51:43 UTC, 5 replies.
- Exception throws when running spark pi in Intellij Idea that scala.collection.Seq is not found - posted by Todd <bi...@163.com> on 2015/08/25 08:48:40 UTC, 3 replies.
- Re: Spark stages very slow to complete - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2015/08/25 09:39:21 UTC, 0 replies.
- Invalid environment variable name when submitting job from windows - posted by Yann ROBIN <me...@gmail.com> on 2015/08/25 09:55:58 UTC, 1 replies.
- spark not launching in yarn-cluster mode - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/08/25 10:19:39 UTC, 2 replies.
- How to efficiently write sorted neighborhood in pyspark - posted by shahid qadri <sh...@icloud.com> on 2015/08/25 11:45:03 UTC, 1 replies.
- Checkpointing in Iterative Graph Computation - posted by sachintyagi22 <sa...@gmail.com> on 2015/08/25 13:15:18 UTC, 0 replies.
- How to increase data scale in Spark SQL Perf - posted by Todd <bi...@163.com> on 2015/08/25 13:22:15 UTC, 7 replies.
- Select some data from Hive (SparkSQL) directly using NodeJS - posted by Phakin Cheangkrachange <pc...@sertiscorp.com> on 2015/08/25 13:22:27 UTC, 0 replies.
- using Convert function of sql in spark sql - posted by Rajeshkumar J <ra...@gmail.com> on 2015/08/25 13:53:53 UTC, 0 replies.
- SparkSQL saveAsParquetFile does not preserve AVRO schema - posted by storm <pe...@gmail.com> on 2015/08/25 14:13:34 UTC, 1 replies.
- Scala: Overload method by its class type - posted by Sa...@wellsfargo.com on 2015/08/25 15:25:11 UTC, 2 replies.
- Adding/subtracting org.apache.spark.mllib.linalg.Vector in Scala? - posted by Kristina Rogale Plazonic <kp...@gmail.com> on 2015/08/25 15:36:07 UTC, 9 replies.
- Spark-Ec2 launch failed on starting httpd spark 141 - posted by Garry Chen <gc...@cornell.edu> on 2015/08/25 16:39:01 UTC, 0 replies.
- Re: Spark-Ec2 launch failed on starting httpd spark 141 - posted by Ted Yu <yu...@gmail.com> on 2015/08/25 16:55:41 UTC, 1 replies.
- DataFrame Parquet Writer doesn't keep schema - posted by Petr Novak <os...@gmail.com> on 2015/08/25 17:02:55 UTC, 1 replies.
- Spark RDD join with CassandraRDD - posted by Priya Ch <le...@gmail.com> on 2015/08/25 17:22:51 UTC, 1 replies.
- Pyspark ImportError: No module named definitions - posted by YaoPau <jo...@gmail.com> on 2015/08/25 17:37:57 UTC, 0 replies.
- Spark (1.2.0) submit fails with exception saying log directory already exists - posted by "Varadhan, Jawahar" <va...@yahoo.com.INVALID> on 2015/08/25 18:37:35 UTC, 1 replies.
- CHAID Decision Trees - posted by jatinpreet <ja...@gmail.com> on 2015/08/25 18:39:14 UTC, 3 replies.
- Error:(46, 66) not found: type SparkFlumeProtocol - posted by Muler <mu...@gmail.com> on 2015/08/25 18:50:33 UTC, 0 replies.
- [SQL/Hive] Trouble with refreshTable - posted by Yana Kadiyska <ya...@gmail.com> on 2015/08/25 18:51:50 UTC, 0 replies.
- Re: Spark-Ec2 launch failed on starting httpd spark 141 - posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu> on 2015/08/25 18:52:59 UTC, 0 replies.
- SparkR: exported functions - posted by Colin Gillespie <cs...@gmail.com> on 2015/08/25 19:26:50 UTC, 1 replies.
- Spark Streaming Checkpointing Restarts with 0 Event Batches - posted by suchenzang <su...@gmail.com> on 2015/08/25 19:53:28 UTC, 12 replies.
- How to unit test HiveContext without OutOfMemoryError (using sbt) - posted by Mike Trienis <mi...@orcsol.com> on 2015/08/25 20:10:51 UTC, 3 replies.
- Fwd: Join with multiple conditions (In reference to SPARK-7197) - posted by Michal Monselise <mi...@gmail.com> on 2015/08/25 20:21:05 UTC, 2 replies.
- SparkSQL problem with IBM BigInsight V3 - posted by java8964 <ja...@hotmail.com> on 2015/08/25 22:41:07 UTC, 0 replies.
- Persisting sorted parquet tables for future sort merge joins - posted by Jason <Ja...@jasonknight.us> on 2015/08/26 02:11:37 UTC, 1 replies.
- Spark thrift server on yarn - posted by Udit Mehta <um...@groupon.com> on 2015/08/26 02:19:17 UTC, 2 replies.
- Question on take function - Spark Java API - posted by Pankaj Wahane <pa...@qiotec.com> on 2015/08/26 04:55:13 UTC, 2 replies.
- Re: use GraphX with Spark Streaming - posted by ponkin <al...@ya.ru> on 2015/08/26 06:44:10 UTC, 0 replies.
- reduceByKey not working on JavaPairDStream - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/26 08:05:48 UTC, 1 replies.
- BlockNotFoundException when running spark word count on Tachyon - posted by Todd <bi...@163.com> on 2015/08/26 08:55:26 UTC, 2 replies.
- SPARK_DIST_CLASSPATH, primordial class loader & app ClassNotFound - posted by Night Wolf <ni...@gmail.com> on 2015/08/26 09:54:11 UTC, 0 replies.
- Relation between threads and executor core - posted by Samya <sa...@amadeus.com> on 2015/08/26 10:47:42 UTC, 3 replies.
- Performance issue with Spark join - posted by lucap <lu...@hotmail.it> on 2015/08/26 11:38:24 UTC, 1 replies.
- ClassCastException using DataFrame only when num-executors > 2 ... - posted by Olivier Girardot <ss...@gmail.com> on 2015/08/26 11:47:55 UTC, 1 replies.
- JobScheduler: Error generating jobs for time for custom InputDStream - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/08/26 13:30:35 UTC, 0 replies.
- Build k-NN graph for large dataset - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/08/26 13:35:45 UTC, 6 replies.
- Custom Offset Management - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/26 14:55:02 UTC, 1 replies.
- Spark-on-YARN LOCAL_DIRS location - posted by mi...@nomura.com on 2015/08/26 15:26:19 UTC, 1 replies.
- Spark Job Stuck in on stage. - posted by Akash Mishra <ak...@gmail.com> on 2015/08/26 15:37:14 UTC, 2 replies.
- Setting number of CORES from inside the Topology (JAVA code ) - posted by anshu shukla <an...@gmail.com> on 2015/08/26 15:56:54 UTC, 1 replies.
- Fwd: Issue with building Spark v1.4.1-rc4 with Scala 2.11 - posted by Felix Neutatz <ne...@googlemail.com> on 2015/08/26 16:07:00 UTC, 1 replies.
- application logs for long lived job on YARN - posted by Chen Song <ch...@gmail.com> on 2015/08/26 16:37:55 UTC, 1 replies.
- Spark 1.3.1 saveAsParquetFile hangs on app exit - posted by cingram <ci...@gmail.com> on 2015/08/26 16:46:03 UTC, 2 replies.
- Re: JDBC Streams - posted by Chen Song <ch...@gmail.com> on 2015/08/26 16:46:07 UTC, 4 replies.
- Building spark-examples takes too much time using Maven - posted by Muhammad Haseeb Javed <11...@seecs.edu.pk> on 2015/08/26 16:56:00 UTC, 1 replies.
- spark streaming 1.3 kafka buffer size - posted by Shushant Arora <sh...@gmail.com> on 2015/08/26 17:39:33 UTC, 3 replies.
- Efficient sampling from a Hive table - posted by Thomas Dudziak <to...@gmail.com> on 2015/08/26 17:53:38 UTC, 3 replies.
- Just Released V1.0.4 Low Level Receiver Based Kafka-Spark-Consumer in Spark Packages having built-in Back Pressure Controller - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2015/08/26 18:32:00 UTC, 0 replies.
- Spark cluster multi tenancy - posted by Sadhan Sood <sa...@gmail.com> on 2015/08/26 19:45:28 UTC, 3 replies.
- Can RDD be shared across the cluster by other drivers? - posted by Tao Lu <ta...@gmail.com> on 2015/08/26 19:47:31 UTC, 1 replies.
- Feedback: Feature request - posted by "Murphy, James" <Ja...@disney.com> on 2015/08/26 20:29:04 UTC, 4 replies.
- query avro hive table in spark sql - posted by gpatcham <gp...@gmail.com> on 2015/08/26 20:32:16 UTC, 9 replies.
- Dataframe collect() work but count() fails - posted by Srikanth <sr...@gmail.com> on 2015/08/26 22:41:25 UTC, 0 replies.
- suggest configuration for debugging spark streaming, kafka - posted by Joanne Contact <jo...@gmail.com> on 2015/08/26 23:02:46 UTC, 1 replies.
- Does the driver program always run local to where you submit the job from? - posted by Jerry <je...@gmail.com> on 2015/08/26 23:03:27 UTC, 2 replies.
- Help! Stuck using withColumn - posted by Sa...@wellsfargo.com on 2015/08/26 23:47:21 UTC, 3 replies.
- Spark.ml vs Spark.mllib - posted by njoshi <ni...@teamaol.com> on 2015/08/27 00:34:30 UTC, 0 replies.
- error accessing vertexRDD - posted by dizzy5112 <da...@gmail.com> on 2015/08/27 02:45:20 UTC, 1 replies.
- Re: Differing performance in self joins - posted by Michael Armbrust <mi...@databricks.com> on 2015/08/27 03:27:37 UTC, 0 replies.
- spark streaming 1.3 kafka topic error - posted by Shushant Arora <sh...@gmail.com> on 2015/08/27 04:07:29 UTC, 5 replies.
- [ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile - posted by Jacek Laskowski <ja...@japila.pl> on 2015/08/27 09:51:26 UTC, 3 replies.
- Selecting different levels of nested data records during one select? - posted by Ewan Leith <ew...@realitymine.com> on 2015/08/27 11:08:50 UTC, 1 replies.
- Driver running out of memory - caused by many tasks? - posted by an...@thomsonreuters.com on 2015/08/27 12:53:16 UTC, 4 replies.
- sbt error -- before Terasort compilation - posted by Shreeharsha G Neelakantachar <sh...@in.ibm.com> on 2015/08/27 14:59:01 UTC, 2 replies.
- spark-submit issue - posted by pranay <pr...@impetus.co.in> on 2015/08/27 15:11:28 UTC, 10 replies.
- commit DB Transaction for each partition - posted by Ahmed Nawar <ah...@gmail.com> on 2015/08/27 16:02:34 UTC, 2 replies.
- Adding Kafka topics to a running streaming context - posted by yael aharon <ya...@gmail.com> on 2015/08/27 16:19:24 UTC, 3 replies.
- Best way to filter null on "any" column? - posted by Sa...@wellsfargo.com on 2015/08/27 16:46:40 UTC, 0 replies.
- Writing test case for spark streaming checkpointing - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/08/27 17:14:54 UTC, 1 replies.
- Is there a way to store RDD and load it with its original format? - posted by Sa...@wellsfargo.com on 2015/08/27 17:59:59 UTC, 1 replies.
- Spark driver locality - posted by Swapnil Shinde <sw...@gmail.com> on 2015/08/27 18:30:12 UTC, 4 replies.
- Getting number of physical machines in Spark - posted by "Young, Matthew T" <ma...@intel.com> on 2015/08/27 19:01:35 UTC, 2 replies.
- Spark Streaming Listener to Kill Stages? - posted by suchenzang <su...@gmail.com> on 2015/08/27 20:14:43 UTC, 0 replies.
- Commit DB Transaction for each partition - posted by Ahmed Nawar <ah...@gmail.com> on 2015/08/27 20:42:43 UTC, 4 replies.
- Data Frame support CSV or excel format ? - posted by spark user <sp...@yahoo.com.INVALID> on 2015/08/27 20:48:29 UTC, 1 replies.
- Porting a multi-threaded compute intensive job to spark - posted by Utkarsh Sengar <ut...@gmail.com> on 2015/08/27 22:32:09 UTC, 0 replies.
- types allowed for saveasobjectfile? - posted by Arun Luthra <ar...@gmail.com> on 2015/08/27 23:08:05 UTC, 4 replies.
- Any quick method to sample rdd based on one field? - posted by Gavin Yue <yu...@gmail.com> on 2015/08/27 23:27:03 UTC, 2 replies.
- tweet transformation ideas - posted by Jesse F Chen <jf...@us.ibm.com> on 2015/08/27 23:46:21 UTC, 2 replies.
- TimeoutException on start-slave spark 1.4.0 - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/08/28 00:07:38 UTC, 1 replies.
- Spark Taking too long on K-means clustering - posted by masoom alam <ma...@wanclouds.net> on 2015/08/28 00:16:24 UTC, 0 replies.
- Array column stored as “.bag” in parquet file instead of “REPEATED INT64" - posted by Jim Green <op...@gmail.com> on 2015/08/28 00:53:02 UTC, 1 replies.
- How to avoid shuffle errors for a large join ? - posted by Thomas Dudziak <to...@gmail.com> on 2015/08/28 03:03:56 UTC, 7 replies.
- How to increase the Json parsing speed - posted by Gavin Yue <yu...@gmail.com> on 2015/08/28 03:58:55 UTC, 7 replies.
- Graphx CompactBuffer help - posted by smagadi <su...@fico.com> on 2015/08/28 08:40:29 UTC, 1 replies.
- Re: Is there any way to connect cassandra without spark-cassandra connector? - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/08/28 08:43:07 UTC, 0 replies.
- RDD from partitions - posted by Jem Tucker <je...@gmail.com> on 2015/08/28 09:03:53 UTC, 2 replies.
- Job aborted due to stage failure: java.lang.StringIndexOutOfBoundsException: String index out of range: 18 - posted by "ouruia@cnsuning.com" <ou...@cnsuning.com> on 2015/08/28 11:08:31 UTC, 4 replies.
- How to determine a good set of parameters for a ML grid search task? - posted by Adamantios Corais <ad...@gmail.com> on 2015/08/28 11:16:40 UTC, 0 replies.
- Spark Version upgrade issue:Exception in thread "main" java.lang.NoSuchMethodError - posted by Manohar753 <ma...@happiestminds.com> on 2015/08/28 12:31:41 UTC, 2 replies.
- Alternative to Large Broadcast Variables - posted by Hemminger Jeff <je...@atware.co.jp> on 2015/08/28 12:39:27 UTC, 5 replies.
- How to compute the probability of each class in Naive Bayes - posted by Adamantios Corais <ad...@gmail.com> on 2015/08/28 14:38:04 UTC, 0 replies.
- Calculating Min and Max Values using Spark Transformations? - posted by ashensw <as...@wso2.com> on 2015/08/28 14:39:53 UTC, 6 replies.
- Why transformer from ml.Pipeline transform only a DataFrame ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/08/28 15:38:55 UTC, 1 replies.
- Feasibility Project - Text Processing and Category Classification - posted by Darksu <ni...@hotmail.com> on 2015/08/28 16:15:12 UTC, 2 replies.
- correct use of DStream foreachRDD - posted by Carol McDonald <cm...@maprtech.com> on 2015/08/28 16:29:31 UTC, 3 replies.
- Kmeans issues and hierarchical clustering - posted by Robust_spark <su...@gmail.com> on 2015/08/28 17:03:17 UTC, 0 replies.
- how to register CompactBuffer in Kryo - posted by donhoff_h <16...@qq.com> on 2015/08/28 17:25:25 UTC, 1 replies.
- SSL between Kafka and Spark Streaming API - posted by Cassa L <lc...@gmail.com> on 2015/08/28 20:00:18 UTC, 4 replies.
- Dynamic lookup table - posted by N B <nb...@gmail.com> on 2015/08/28 20:38:47 UTC, 2 replies.
- Help Explain Tasks in WebUI:4040 - posted by Muler <mu...@gmail.com> on 2015/08/28 20:47:02 UTC, 3 replies.
- Support for Hive Storage Handle in Spark SQL and core Spark - posted by Sourav Mazumder <so...@gmail.com> on 2015/08/28 22:26:20 UTC, 0 replies.
- How to send RDD result to REST API? - posted by Cassa L <lc...@gmail.com> on 2015/08/29 06:35:13 UTC, 3 replies.
- Apache Spark Suitable JDBC Driver not found - posted by shawon <sh...@gmail.com> on 2015/08/29 12:13:30 UTC, 1 replies.
- Spark Effects of Driver Memory, Executor Memory, Driver Memory Overhead and Executor Memory Overhead on success of job runs - posted by timothy22000 <ti...@gmail.com> on 2015/08/30 03:57:50 UTC, 1 replies.
- Spark shell and StackOverFlowError - posted by ashrowty <as...@gmail.com> on 2015/08/30 05:21:36 UTC, 18 replies.
- How to generate spark assembly (jar file) using Intellij - posted by Muler <mu...@gmail.com> on 2015/08/30 06:03:36 UTC, 1 replies.
- Spark MLLIB multiclass classification - posted by Zsombor Egyed <eg...@starschema.net> on 2015/08/30 06:23:27 UTC, 3 replies.
- Re: Spark Python with SequenceFile containing numpy deserialized data in str form - posted by Peter Aberline <pe...@gmail.com> on 2015/08/30 15:20:54 UTC, 0 replies.
- submit_spark_job_to_YARN - posted by Ajay Chander <it...@gmail.com> on 2015/08/30 17:21:57 UTC, 5 replies.
- Where is the doc about the spark rest api ? - posted by canan chen <cc...@gmail.com> on 2015/08/31 05:24:47 UTC, 3 replies.
- Spark SQL vs Spark Programming - posted by satish chandra j <js...@gmail.com> on 2015/08/31 06:07:16 UTC, 2 replies.
- Re: Unable to build Spark 1.5, is build broken or can anyone successfully build? - posted by Kevin Jung <it...@samsung.com> on 2015/08/31 07:27:59 UTC, 2 replies.
- Distance Calculation in Spark K means clustering - posted by ashensw <as...@wso2.com> on 2015/08/31 08:53:16 UTC, 0 replies.
- Slow Mongo Read from Spark - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/31 08:56:17 UTC, 6 replies.
- Standalone mode: is SPARK_WORKER_MEMORY per SPARK_WORKER_INSTANCE? - posted by Muler <mu...@gmail.com> on 2015/08/31 10:06:06 UTC, 1 replies.
- [MLlib] DIMSUM row similarity? - posted by Maandy <dy...@gmail.com> on 2015/08/31 10:17:39 UTC, 1 replies.
- Data Security on Spark-on-HDFS - posted by Daniel Schulz <da...@hotmail.com> on 2015/08/31 12:02:05 UTC, 1 replies.
- Write Concern used in Mongo-Hadoop Connector - posted by Deepesh Maheshwari <de...@gmail.com> on 2015/08/31 13:39:08 UTC, 1 replies.
- Reading xml in java using spark - posted by rakesh sharma <ra...@hotmail.com> on 2015/08/31 13:40:24 UTC, 2 replies.
- Parallel execution of RDDs - posted by Brian Parker <as...@gmail.com> on 2015/08/31 15:51:53 UTC, 1 replies.
- start master failed with error - posted by Garry Chen <gc...@cornell.edu> on 2015/08/31 18:02:38 UTC, 1 replies.
- Managing httpcomponent dependency in Spark/Solr - posted by Oliver Schrenk <ol...@gmail.com> on 2015/08/31 18:33:41 UTC, 0 replies.
- Spark executor OOM issue on YARN - posted by unk1102 <um...@gmail.com> on 2015/08/31 20:03:35 UTC, 2 replies.
- Potential NPE while exiting spark-shell - posted by nasokan <an...@gmail.com> on 2015/08/31 20:39:44 UTC, 2 replies.
- Too many open files issue - posted by Sigurd Knippenberg <si...@knippenberg.com> on 2015/08/31 20:49:59 UTC, 0 replies.
- Is it possible to create spark cluster in different network? - posted by sakana <ma...@gmail.com> on 2015/08/31 20:59:48 UTC, 0 replies.
- Problems with Tungsten in Spark 1.5.0-rc2 - posted by Anders Arpteg <ar...@spotify.com> on 2015/08/31 21:34:16 UTC, 0 replies.
- Exceptions in threads in executor code don't get caught properly - posted by Wayne Song <wa...@gmail.com> on 2015/08/31 21:52:33 UTC, 0 replies.
- Checkpointing in Spark without Streaming - posted by Ian Wood <ib...@gmail.com> on 2015/08/31 22:39:54 UTC, 0 replies.
- Parsing nested json objects with variable structure - posted by SK <sk...@gmail.com> on 2015/08/31 22:56:12 UTC, 0 replies.
- Window Sliding In spark - posted by pa...@qiotec.com on 2015/08/31 23:18:21 UTC, 0 replies.