user@spark.apache.org, 2015-11

- Re: Programatically create RDDs based on input - posted by Natu Lauchande <nl...@gmail.com> on 2015/11/01 06:28:48 UTC, 1 replies.
- Re: If you use Spark 1.5 and disabled Tungsten mode ... - posted by Reynold Xin <rx...@databricks.com> on 2015/11/01 08:22:39 UTC, 1 replies.
- Re: How to lookup by a key in an RDD - posted by Gylfi <gy...@berkeley.edu> on 2015/11/01 09:21:51 UTC, 5 replies.
- Re: job hangs when using pipe() with reduceByKey() - posted by Gylfi <gy...@berkeley.edu> on 2015/11/01 09:33:05 UTC, 1 replies.
- Re: Spark 1.5 on CDH 5.4.0 - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/01 15:50:49 UTC, 0 replies.
- Some spark apps fail with "All masters are unresponsive", while others pass normally - posted by Romi Kuntsman <ro...@totango.com> on 2015/11/01 17:08:23 UTC, 4 replies.
- [Spark MLlib] about linear regression issue - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/01 18:22:28 UTC, 2 replies.
- apply simplex method to fix linear programming in spark - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/01 18:37:13 UTC, 9 replies.
- spark sql partitioned by date... read last date - posted by Koert Kuipers <ko...@tresata.com> on 2015/11/01 21:03:35 UTC, 8 replies.
- Occasionally getting RpcTimeoutException - posted by Jake Yoon <su...@gmail.com> on 2015/11/01 23:50:57 UTC, 0 replies.
- Re: issue with spark.driver.maxResultSize parameter in spark 1.3 - posted by karthik kadiyam <ka...@gmail.com> on 2015/11/02 03:47:11 UTC, 1 replies.
- Sort Merge Join - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/11/02 04:29:26 UTC, 4 replies.
- Spark Streaming data checkpoint performance - posted by Thúy Hằng Lê <th...@gmail.com> on 2015/11/02 06:20:46 UTC, 10 replies.
- RE: How to set memory for SparkR with master="local[*]" - posted by "Sun, Rui" <ru...@intel.com> on 2015/11/02 06:47:35 UTC, 0 replies.
- Re: Caching causes later actions to get stuck - posted by Sampo Niskanen <sa...@wellmo.com> on 2015/11/02 07:33:08 UTC, 0 replies.
- Error : - No filesystem for scheme: spark - posted by "Balachandar R.A." <ba...@gmail.com> on 2015/11/02 07:48:22 UTC, 8 replies.
- Re: Running 2 spark application in parallel - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 07:59:46 UTC, 0 replies.
- RE: sparkR 1.5.1 batch yarn-client mode failing on daemon.R not found - posted by "Sun, Rui" <ru...@intel.com> on 2015/11/02 08:01:15 UTC, 1 replies.
- Re: spark.python.worker.memory Discontinuity - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 08:11:38 UTC, 0 replies.
- Re: Spark Streaming: how to StreamingContext.queueStream - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 08:25:08 UTC, 0 replies.
- Re: streaming.twitter.TwitterUtils what is the best way to save twitter status to HDFS? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 08:31:30 UTC, 0 replies.
- Re: Unable to use saveAsSequenceFile - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 08:34:40 UTC, 0 replies.
- Re: java how to configure streaming.dstream.DStream<> saveAsTextFiles() to work with hdfs? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 08:38:07 UTC, 0 replies.
- RE: SparkR job with >200 tasks hangs when calling from web server - posted by "Sun, Rui" <ru...@intel.com> on 2015/11/02 08:53:30 UTC, 0 replies.
- Re: Submitting Spark Applications - Do I need to leave ports open? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 09:38:20 UTC, 0 replies.
- Re: How to catch error during Spark job? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 10:18:00 UTC, 2 replies.
- Re: --jars option using hdfs jars cannot effect when spark standlone deploymode with cluster - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/02 10:29:00 UTC, 2 replies.
- Required file not found: sbt-interface.jar - posted by Todd <bi...@163.com> on 2015/11/02 10:51:46 UTC, 1 replies.
- Split RDD into multiple RDDs using filter-transformation - posted by Sushrut Ikhar <su...@gmail.com> on 2015/11/02 10:53:59 UTC, 1 replies.
- Spark Streaming and periodic broadcast - posted by Serafín Sedano Arenas <se...@gmail.com> on 2015/11/02 14:16:00 UTC, 0 replies.
- Re: Best practises - posted by satish chandra j <js...@gmail.com> on 2015/11/02 14:18:33 UTC, 3 replies.
- Re: Exception while reading from kafka stream - posted by Ramkumar V <ra...@gmail.com> on 2015/11/02 14:26:53 UTC, 2 replies.
- Why does sortByKey() transformation trigger a job in spark-shell? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/11/02 14:34:29 UTC, 2 replies.
- Spark, Mesos problems with remote connections - posted by Sebastian Kuepers <se...@publicispixelpark.de> on 2015/11/02 15:24:20 UTC, 1 replies.
- execute native system commands in Spark - posted by patcharee <Pa...@uni.no> on 2015/11/02 16:36:47 UTC, 2 replies.
- FW: Spark streaming - failed recovery from checkpoint - posted by Adrian Tanase <at...@adobe.com> on 2015/11/02 16:40:40 UTC, 0 replies.
- [Yarn] How to set user in ContainerLaunchContext? - posted by Peter Rudenko <pe...@gmail.com> on 2015/11/02 17:02:45 UTC, 1 replies.
- Re: PySpark + Streaming + DataFrames - posted by Jason White <ja...@shopify.com> on 2015/11/02 20:00:17 UTC, 0 replies.
- Where does mllib's .save method save a model to? - posted by "apu mishra . rr" <ap...@gmail.com> on 2015/11/02 20:19:39 UTC, 2 replies.
- kinesis batches hang after YARN automatic driver restart - posted by Hster Geguri <hs...@gmail.com> on 2015/11/02 20:26:21 UTC, 2 replies.
- SparkSQL implicit conversion on insert - posted by Bryan Jeffrey <br...@gmail.com> on 2015/11/02 20:40:44 UTC, 1 replies.
- Fwd: Getting ClassNotFoundException: scala.Some on Spark 1.5.x - posted by Babar Tareen <ba...@gmail.com> on 2015/11/02 20:48:31 UTC, 3 replies.
- Standalone cluster not using multiple workers for single application - posted by Jeff Jones <jj...@adaptivebiotech.com> on 2015/11/02 20:56:13 UTC, 2 replies.
- Spark SQL lag() window function, strange behavior - posted by Ro...@thomsonreuters.com on 2015/11/02 21:07:36 UTC, 2 replies.
- Dump table into file - posted by Shepherd <Ch...@huawei.com> on 2015/11/02 21:20:43 UTC, 1 replies.
- Time-series prediction using spark - posted by Cui Lin <ic...@gmail.com> on 2015/11/02 21:29:00 UTC, 0 replies.
- How to handle Option[Int] in dataframe - posted by manas kar <po...@gmail.com> on 2015/11/02 21:30:53 UTC, 1 replies.
- upgrading from spark 1.4 to latest version - posted by roni <ro...@gmail.com> on 2015/11/03 00:27:54 UTC, 0 replies.
- ipython notebook NameError: name 'sc' is not defined - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/03 01:04:13 UTC, 1 replies.
- ClassNotFoundException even if class is present in Jarfile - posted by hveiga <ke...@gmail.com> on 2015/11/03 01:43:15 UTC, 2 replies.
- Spark executor jvm classloader not able to load nested jars - posted by Nirav Patel <np...@xactlycorp.com> on 2015/11/03 01:44:22 UTC, 0 replies.
- Does the Standalone cluster and Applications need to be same Spark version? - posted by pnpritchard <ni...@falkonry.com> on 2015/11/03 03:11:28 UTC, 2 replies.
- Re: callUdf("percentile_approx",col("mycol"),lit(0.25)) does not compile spark 1.5.1 source but it does work in spark 1.5.1 bin - posted by Umesh Kacha <um...@gmail.com> on 2015/11/03 08:01:45 UTC, 0 replies.
- Re: How do I get the executor ID from running Java code - posted by Gideon <gi...@volcanodata.com> on 2015/11/03 08:34:17 UTC, 0 replies.
- Re: Prevent partitions from moving - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/03 12:08:56 UTC, 0 replies.
- Re: Apache Spark on Raspberry Pi Cluster with Docker - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/03 12:12:59 UTC, 0 replies.
- Re: Issue of Hive parquet partitioned table schema mismatch - posted by Rex Xiong <by...@gmail.com> on 2015/11/03 12:14:30 UTC, 4 replies.
- Re: Improve parquet write speed to HDFS and spark.sql.execution.id is already set ERROR - posted by Ted Yu <yu...@gmail.com> on 2015/11/03 13:48:21 UTC, 1 replies.
- How to enable debug in Spark Streaming? - posted by diplomatic Guru <di...@gmail.com> on 2015/11/03 14:29:35 UTC, 1 replies.
- Re: spark read data from aws s3 - posted by hveiga <ke...@gmail.com> on 2015/11/03 15:00:40 UTC, 0 replies.
- kerberos question - posted by Chen Song <ch...@gmail.com> on 2015/11/03 17:57:42 UTC, 3 replies.
- Limit the size of /tmp/[...].inprogress files in Spark Streaming - posted by Mathieu Garstecki <ma...@octo.com> on 2015/11/03 18:05:08 UTC, 0 replies.
- RE: Very slow performance on very small record counts - posted by "Young, Matthew T" <ma...@intel.com> on 2015/11/03 18:17:38 UTC, 1 replies.
- Vague Spark SQL error message with saveAsParquetFile - posted by YaoPau <jo...@gmail.com> on 2015/11/03 18:23:19 UTC, 1 replies.
- bin/pyspark SparkContext is missing? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/03 18:59:23 UTC, 2 replies.
- best practices machine learning with python 2 or 3? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/03 19:32:12 UTC, 0 replies.
- Frozen exception while dynamically creating classes inside Spark using JavaAssist API - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/11/03 19:52:29 UTC, 0 replies.
- collect() local faster than 4 node cluster - posted by Sebastian Kuepers <se...@publicispixelpark.de> on 2015/11/03 20:07:28 UTC, 2 replies.
- Spark dynamic allocation config - posted by billou2k <bi...@googlemail.com> on 2015/11/03 20:08:57 UTC, 1 replies.
- Spark Streaming saveAsTextFiles to Amazon S3 - posted by Yuan Zhang <ni...@gmail.com> on 2015/11/03 21:16:37 UTC, 0 replies.
- Why some executors are lazy? - posted by Khaled Ammar <kh...@gmail.com> on 2015/11/03 22:43:49 UTC, 3 replies.
- Support Ordering on UserDefinedType - posted by Ionized <io...@gmail.com> on 2015/11/03 23:20:09 UTC, 1 replies.
- error with saveAsTextFile in local directory - posted by Jack Yang <ji...@uow.edu.au> on 2015/11/04 00:07:03 UTC, 2 replies.
- Please reply if you use Mesos fine grained mode - posted by Reynold Xin <rx...@databricks.com> on 2015/11/04 00:54:06 UTC, 5 replies.
- Upgrade spark cluster to latest version - posted by roni <ro...@gmail.com> on 2015/11/04 00:58:23 UTC, 1 replies.
- New Apache Spark Meetup NRW, Germany - posted by pchundi <ma...@gmail.com> on 2015/11/04 01:18:24 UTC, 0 replies.
- Rule Engine for Spark - posted by Cassa L <lc...@gmail.com> on 2015/11/04 01:42:20 UTC, 7 replies.
- how to get Spark stage DAGs thru the REST APIs? - posted by Xiaoyong Zhu <xi...@microsoft.com> on 2015/11/04 02:47:00 UTC, 0 replies.
- Spark 1.5.1 on Mesos NO Executor Java Options - posted by Jo Voordeckers <jo...@gmail.com> on 2015/11/04 04:59:26 UTC, 0 replies.
- PMML version in MLLib - posted by Fazlan Nazeem <fa...@wso2.com> on 2015/11/04 06:39:11 UTC, 9 replies.
- What does "write time" means exactly in Spark UI? - posted by Khaled Ammar <kh...@gmail.com> on 2015/11/04 07:21:37 UTC, 0 replies.
- Checkpoint not working after driver restart - posted by vimal dinakaran <vi...@gmail.com> on 2015/11/04 07:57:18 UTC, 1 replies.
- dataframe slow down with tungsten turn on - posted by gen tang <ge...@gmail.com> on 2015/11/04 08:54:40 UTC, 2 replies.
- Dynamic (de)allocation with Spark Streaming - posted by Wojciech Pituła <w....@gmail.com> on 2015/11/04 09:05:38 UTC, 0 replies.
- Codegen In Shuffle - posted by 牛兆捷 <nz...@gmail.com> on 2015/11/04 09:21:36 UTC, 2 replies.
- spark filter function - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/04 09:34:26 UTC, 0 replies.
- Running Apache Spark 1.5.1 on console2 - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/04 09:40:04 UTC, 0 replies.
- Avoid RDD.saveAsTextFile() generating empty part-* and .crc files - posted by amit tewari <am...@gmail.com> on 2015/11/04 11:05:49 UTC, 0 replies.
- Prevent possible out of memory when using read/union - posted by Alexander Lenz <al...@lenz.tk> on 2015/11/04 11:12:37 UTC, 1 replies.
- Custom application.conf on spark executor nodes? - posted by Adrian Tanase <at...@adobe.com> on 2015/11/04 13:39:00 UTC, 0 replies.
- SparkSQL JDBC to PostGIS - posted by Mustafa Elbehery <el...@gmail.com> on 2015/11/04 13:46:35 UTC, 2 replies.
- Re: Parsing a large XML file using Spark - posted by Jin <gu...@gmail.com> on 2015/11/04 13:50:06 UTC, 0 replies.
- Allow multiple SparkContexts in Unit Testing - posted by Priya Ch <le...@gmail.com> on 2015/11/04 13:59:55 UTC, 3 replies.
- Looking for the method executors uses to write to HDFS - posted by Tóth Zoltán <tz...@looper.hu> on 2015/11/04 14:11:43 UTC, 1 replies.
- Distributing Python code packaged as tar balls - posted by Praveen Chundi <ma...@gmail.com> on 2015/11/04 14:40:52 UTC, 2 replies.
- Problem using BlockMatrix.add - posted by Kareem Sorathia <ka...@gmail.com> on 2015/11/04 16:22:49 UTC, 1 replies.
- Spark 1.5.1 Dynamic Resource Allocation - posted by tstewart <st...@yahoo.com> on 2015/11/04 17:10:45 UTC, 4 replies.
- Cassandra via SparkSQL/Hive JDBC - posted by Bryan Jeffrey <br...@gmail.com> on 2015/11/04 17:16:57 UTC, 11 replies.
- Spark driver, Docker, and Mesos - posted by "PHELIPOT, REMY" <re...@atos.net> on 2015/11/04 17:24:38 UTC, 1 replies.
- DataFrame.toJavaRDD cause fetching data to driver, is it expected ? - posted by Aliaksei Tsyvunchyk <at...@exadel.com> on 2015/11/04 17:51:19 UTC, 4 replies.
- Is the resources specified in configuration shared by all jobs? - posted by Nisrina Luthfiyati <ni...@gmail.com> on 2015/11/04 18:24:35 UTC, 3 replies.
- SPARK_SSH_FOREGROUND format - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/11/04 19:36:07 UTC, 0 replies.
- Executor app-20151104202102-0000 finished with state EXITED - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/11/04 19:47:05 UTC, 4 replies.
- [Spark 1.5]: Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError: Java heap space - posted by Shuai Zheng <sz...@gmail.com> on 2015/11/04 21:21:52 UTC, 0 replies.
- PairRDD from SQL - posted by pratik khadloya <ti...@gmail.com> on 2015/11/04 21:44:33 UTC, 1 replies.
- RE: [Spark 1.5]: Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError: Java heap space -- Work in 1.4, but 1.5 doesn't - posted by Shuai Zheng <sz...@gmail.com> on 2015/11/04 23:55:48 UTC, 0 replies.
- Efficient approach to store an RDD as a file in HDFS and read it back as an RDD? - posted by swetha <sw...@gmail.com> on 2015/11/05 00:09:47 UTC, 7 replies.
- Futures timed out after [120 seconds]. - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/11/05 00:53:56 UTC, 0 replies.
- ExecutorId in JAVA_OPTS - posted by "surbhi.mungre" <mu...@gmail.com> on 2015/11/05 00:55:17 UTC, 0 replies.
- Memory are not used according to setting - posted by William Li <a-...@expedia.com> on 2015/11/05 00:55:54 UTC, 1 replies.
- Protobuff 3.0 for Spark - posted by Cassa L <lc...@gmail.com> on 2015/11/05 01:07:31 UTC, 4 replies.
- How to unpersist a DStream in Spark Streaming - posted by swetha <sw...@gmail.com> on 2015/11/05 02:03:56 UTC, 6 replies.
- how to run RStudio or RStudio Server on ec2 cluster? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/05 02:11:14 UTC, 1 replies.
- Question about Spark shuffle read size - posted by Dogtail L <sp...@gmail.com> on 2015/11/05 02:33:00 UTC, 0 replies.
- Spark reading from S3 getting very slow - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/11/05 03:03:19 UTC, 1 replies.
- streaming+sql with block has been removed error - posted by ZhuGe <tc...@outlook.com> on 2015/11/05 09:27:34 UTC, 0 replies.
- Scheduling Spark process - posted by danilo <da...@gmail.com> on 2015/11/05 09:48:21 UTC, 3 replies.
- DataFrame equality does not working in 1.5.1 - posted by 千成徳 <s....@opt.ne.jp> on 2015/11/05 10:12:33 UTC, 6 replies.
- How to run parallel on each DataFrame group - posted by patcharee <Pa...@uni.no> on 2015/11/05 10:40:50 UTC, 0 replies.
- converting categorical values in csv file to numerical values - posted by "Balachandar R.A." <ba...@gmail.com> on 2015/11/05 10:54:55 UTC, 3 replies.
- very slow parquet file write - posted by Rok Roskar <ro...@gmail.com> on 2015/11/05 11:08:29 UTC, 9 replies.
- Spark task hangs infinitely when accessing S3 from AWS - posted by aecc <al...@gmail.com> on 2015/11/05 11:39:43 UTC, 5 replies.
- Fwd: UnresolvedException - lag, window - posted by Jiří Syrový <sy...@gmail.com> on 2015/11/05 12:13:41 UTC, 0 replies.
- JMX with Spark - posted by Yogesh Vyas <in...@gmail.com> on 2015/11/05 13:08:36 UTC, 3 replies.
- Re: Spark standalone: zookeeper timeout configuration - posted by yueqianzhu <yu...@163.com> on 2015/11/05 13:11:50 UTC, 0 replies.
- Subtract on rdd2 is throwing below exception - posted by Priya Ch <le...@gmail.com> on 2015/11/05 14:32:32 UTC, 1 replies.
- How to use data from Database and reload every hour - posted by Kay-Uwe Moosheimer <Uw...@Moosheimer.com> on 2015/11/05 14:33:24 UTC, 2 replies.
- Spark sql jdbc fails for Oracle NUMBER type columns - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/11/05 14:51:29 UTC, 5 replies.
- cartesian in the loop, runtime grows - posted by Faerman Evgeniy <ev...@googlemail.com> on 2015/11/05 16:47:16 UTC, 1 replies.
- Spark using Yarn timelineserver - High CPU usage - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/11/05 16:57:39 UTC, 0 replies.
- Spark EC2 script on Large clusters - posted by Christian <en...@gmail.com> on 2015/11/05 17:12:42 UTC, 8 replies.
- 101 question on external metastore - posted by Yana Kadiyska <ya...@gmail.com> on 2015/11/05 18:32:04 UTC, 0 replies.
- Re: Building scaladoc using "build/sbt unidoc" failure - posted by vectorijk <ji...@gmail.com> on 2015/11/05 19:12:47 UTC, 0 replies.
- Spark SQL supports operating on a thrift data sources - posted by Jaydeep Vishwakarma <ja...@inmobi.com> on 2015/11/05 20:21:47 UTC, 1 replies.
- Spark Dynamic Partitioning Bug - posted by Bryan Jeffrey <br...@gmail.com> on 2015/11/05 22:15:44 UTC, 0 replies.
- Spark Analytics - posted by Andrés Ivaldi <ia...@gmail.com> on 2015/11/05 22:38:34 UTC, 0 replies.
- Kinesis connection timeout setting on Spark Streaming Kinesis ASL - posted by Hster Geguri <hs...@gmail.com> on 2015/11/05 23:13:42 UTC, 0 replies.
- Re: Spark SQL "SELECT ... LIMIT" scans the entire Hive table? - posted by Jon Gregg <jo...@gmail.com> on 2015/11/06 00:02:19 UTC, 0 replies.
- Guava ClassLoading Issue When Using Different Hive Metastore Version - posted by Joey Paskhay <jo...@gmail.com> on 2015/11/06 00:41:04 UTC, 2 replies.
- Spark Slave always fails to connect to master - posted by أنس الليثي <de...@gmail.com> on 2015/11/06 00:42:53 UTC, 0 replies.
- Spark RDD cache persistence - posted by Deepak Sharma <de...@gmail.com> on 2015/11/06 02:17:44 UTC, 5 replies.
- "Master: got disassociated, removing it." - posted by Khaled Ammar <kh...@gmail.com> on 2015/11/06 06:18:42 UTC, 0 replies.
- Re: Failed to save RDD as text file to local file system - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/06 06:36:53 UTC, 0 replies.
- Unable to register UDF with StructType - posted by Rishabh Bhardwaj <rb...@gmail.com> on 2015/11/06 06:53:59 UTC, 3 replies.
- [Streaming] Long time to catch up when streaming application restarts from checkpoint - posted by Terry Hoo <hu...@gmail.com> on 2015/11/06 09:14:51 UTC, 0 replies.
- ResultStage's parent stages only ShuffleMapStages? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/11/06 09:15:01 UTC, 1 replies.
- [Spark R]could not allocate memory (2048 Mb) in C function 'R_AllocStringBuffer' - posted by Todd <bi...@163.com> on 2015/11/06 10:00:19 UTC, 1 replies.
- Re: Dynamic Allocation & Spark Streaming - posted by Kyle Lin <ky...@gmail.com> on 2015/11/06 10:48:57 UTC, 1 replies.
- Checkpointing an InputDStream from Kafka - posted by Kathi Stutz <em...@kathistutz.de> on 2015/11/06 10:59:20 UTC, 2 replies.
- Re: Get complete row with latest timestamp after a groupBy? - posted by bghit <bo...@gmail.com> on 2015/11/06 11:07:30 UTC, 1 replies.
- Spark Streaming : minimum cores for a Receiver - posted by mpals <ma...@gmail.com> on 2015/11/06 11:42:27 UTC, 1 replies.
- Serializers problems maping RDDs to objects again - posted by Iker Perez de Albeniz <ik...@fon.com> on 2015/11/06 13:44:06 UTC, 1 replies.
- spark 1.5.0 mllib lda eats up all the disk space - posted by "TheGeorge1918 ." <zh...@gmail.com> on 2015/11/06 15:43:58 UTC, 0 replies.
- [sparkR] Any insight on java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Dhaval Patel <dh...@gmail.com> on 2015/11/06 17:26:19 UTC, 1 replies.
- Is there anyway to do partition discovery without 'field=' in folder names? - posted by Wei Chen <we...@gmail.com> on 2015/11/06 17:44:14 UTC, 0 replies.
- [Spark-SQL]: Disable HiveContext from instantiating in spark-shell - posted by Jerry Lam <ch...@gmail.com> on 2015/11/06 17:53:20 UTC, 10 replies.
- Re: creating a distributed index - posted by swetha kasireddy <sw...@gmail.com> on 2015/11/06 18:02:38 UTC, 0 replies.
- Spark Streaming updateStateByKey Implementation - posted by Hien Luu <hl...@linkedin.com> on 2015/11/06 18:25:07 UTC, 2 replies.
- Spark SQL 'explode' command failing on AWS EC2 but succeeding locally - posted by Anthony Rose <an...@gmail.com> on 2015/11/06 19:27:04 UTC, 0 replies.
- Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/11/06 22:32:48 UTC, 2 replies.
- anyone using netlib-java with sparkR on yarn spark1.6? - posted by Tom Graves <tg...@yahoo.com.INVALID> on 2015/11/06 22:39:52 UTC, 3 replies.
- bug: can not run Ipython notebook on cluster - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/06 23:18:06 UTC, 1 replies.
- What is the efficient way to Join two RDDs? - posted by swetha <sw...@gmail.com> on 2015/11/07 00:21:49 UTC, 2 replies.
- spark ec2 script doest not install necessary files to launch spark - posted by Emaasit <da...@gmail.com> on 2015/11/07 00:30:11 UTC, 1 replies.
- Spark Job failing with exit status 15 - posted by Shashi Vishwakarma <sh...@gmail.com> on 2015/11/07 16:59:48 UTC, 3 replies.
- streaming: missing data. does saveAsTextFile() append or replace? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/07 22:55:19 UTC, 2 replies.
- sqlCtx.sql('some_hive_table') works in pyspark but not spark-submit - posted by YaoPau <jo...@gmail.com> on 2015/11/07 23:12:45 UTC, 1 replies.
- Re: Whether Spark is appropriate for our use case. - posted by Igor Berman <ig...@gmail.com> on 2015/11/08 08:18:11 UTC, 0 replies.
- Connecting SparkR through Yarn - posted by Amit Behera <am...@gmail.com> on 2015/11/08 19:36:29 UTC, 2 replies.
- Broadcast Variables not showing inside Partitions Apache Spark - posted by prajwol sangat <ps...@gmail.com> on 2015/11/08 23:24:20 UTC, 0 replies.
- passing RDDs/DataFrames as arguments to functions - what happens? - posted by Kristina Rogale Plazonic <kp...@gmail.com> on 2015/11/09 02:57:43 UTC, 0 replies.
- How to use --principal and --keytab in SparkSubmit - posted by Todd <bi...@163.com> on 2015/11/09 03:01:54 UTC, 0 replies.
- Re: Is SPARK is the right choice for traditional OLAP query processing? - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/09 05:03:16 UTC, 1 replies.
- Re: How to analyze weather data in Spark? - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/09 05:20:37 UTC, 1 replies.
- Re: why prebuild spark 1.5.1 still say Failed to find Spark assembly in - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/09 05:23:53 UTC, 0 replies.
- Re: visualizations using the apache spark - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/09 05:50:23 UTC, 0 replies.
- Clustering of Words - posted by Deep Pradhan <pr...@gmail.com> on 2015/11/09 06:09:45 UTC, 1 replies.
- OLAP query using spark dataframe with cassandra - posted by "fightfate@163.com" <fi...@163.com> on 2015/11/09 07:02:47 UTC, 11 replies.
- PySpark: cannot convert float infinity to integer, when setting batch in add_shuffle_key - posted by tr...@gmail.com on 2015/11/09 08:32:10 UTC, 0 replies.
- java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver - posted by fanooos <de...@gmail.com> on 2015/11/09 08:36:58 UTC, 7 replies.
- Unwanted SysOuts in Spark Parquet - posted by swetha <sw...@gmail.com> on 2015/11/09 08:40:46 UTC, 2 replies.
- Wrap an RDD with a ShuffledRDD - posted by Muhammad Haseeb Javed <11...@seecs.edu.pk> on 2015/11/09 08:41:48 UTC, 0 replies.
- parquet.io.ParquetEncodingException Warning when trying to save parquet file in Spark - posted by swetha <sw...@gmail.com> on 2015/11/09 08:43:56 UTC, 4 replies.
- Batch Recovering from Checkpoint is taking longer runtime than usual - posted by kundan kumar <ii...@gmail.com> on 2015/11/09 11:51:26 UTC, 0 replies.
- shapely + pyspark - posted by ikeralbeniz <ik...@fon.com> on 2015/11/09 12:07:29 UTC, 0 replies.
- What would happen when reduce memory is not enough on spark shuffle read stage? - posted by JoneZhang <jo...@gmail.com> on 2015/11/09 13:14:22 UTC, 0 replies.
- Re: [SPARK STREAMING ] Sending data to ElasticSearch - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/09 14:39:01 UTC, 0 replies.
- Re: How to properly read the first number lines of file into a RDD - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/09 14:51:56 UTC, 1 replies.
- Re: Issue on spark.driver.maxResultSize - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/09 14:58:31 UTC, 1 replies.
- Re: heap memory - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/11/09 15:04:14 UTC, 0 replies.
- Help with serialization - posted by Eyal Sharon <ey...@scene53.com> on 2015/11/09 17:06:41 UTC, 0 replies.
- Kafka Direct does not recover automatically when the Kafka Stream gets messed up? - posted by swetha <sw...@gmail.com> on 2015/11/09 19:26:21 UTC, 1 replies.
- Anybody hit this issue in spark shell? - posted by Zhan Zhang <zz...@hortonworks.com> on 2015/11/09 19:37:31 UTC, 13 replies.
- Re: Kafka Direct does not recover automatically when the Kafka Stream gets messed up? - posted by Cody Koeninger <co...@koeninger.org> on 2015/11/09 20:09:35 UTC, 9 replies.
- status of slaves in standalone cluster rest/rpc call - posted by Igor Berman <ig...@gmail.com> on 2015/11/09 21:41:29 UTC, 1 replies.
- Spark IndexedRDD dependency in Maven - posted by swetha <sw...@gmail.com> on 2015/11/09 22:34:16 UTC, 1 replies.
- Slow stage? - posted by Simone Franzini <ca...@gmail.com> on 2015/11/09 22:52:15 UTC, 3 replies.
- Overriding Derby in hive-site.xml giving strange results... - posted by mayurladwa <ma...@blackrock.com> on 2015/11/10 00:32:52 UTC, 2 replies.
- First project in scala IDE : first problem - posted by didier vila <vi...@hotmail.com> on 2015/11/10 00:39:55 UTC, 1 replies.
- spark shared RDD - posted by Ben <la...@gmail.com> on 2015/11/10 00:45:43 UTC, 2 replies.
- Is it possible Running SparkR on 2 nodes without HDFS - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2015/11/10 03:06:16 UTC, 2 replies.
- could not see the print out log in spark functions as mapPartitions - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/10 04:33:58 UTC, 4 replies.
- Re: kryos serializer - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/10 07:04:17 UTC, 0 replies.
- Why is Kryo not the default serializer? - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/10 07:13:03 UTC, 2 replies.
- Re: java.lang.NoSuchMethodError: org.apache.spark.ui.SparkUI.addStaticHandler(Ljava/lang/String;Ljava/lang/String; - posted by Hitoshi Ozawa <oz...@worksap.co.jp> on 2015/11/10 07:43:21 UTC, 0 replies.
- static spark Function as map - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/10 09:12:38 UTC, 0 replies.
- could not understand issue about static spark Function (map / sortBy ...) - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/10 10:12:36 UTC, 1 replies.
- NoSuchElementException: key not found - posted by Ankush Khanna <an...@icloud.com> on 2015/11/10 11:37:49 UTC, 2 replies.
- Save to distributed file system from worker processes - posted by "bikash.mnr" <bi...@gmail.com> on 2015/11/10 12:11:24 UTC, 0 replies.
- Re: AnalysisException Handling for unspecified field in Spark SQL - posted by Arvin <ar...@gmail.com> on 2015/11/10 14:25:56 UTC, 0 replies.
- [Yarn] Executor cores isolation - posted by Peter Rudenko <pe...@gmail.com> on 2015/11/10 14:33:17 UTC, 3 replies.
- What are the .snapshot files in /home/spark/Snapshots? - posted by Dmitry Goldenberg <dg...@gmail.com> on 2015/11/10 15:46:02 UTC, 1 replies.
- A question about accumulator - posted by Tan Tim <un...@gmail.com> on 2015/11/10 16:52:29 UTC, 0 replies.
- Re: Spark 1.5 UDAF ArrayType - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/11/10 17:06:19 UTC, 1 replies.
- NullPointerException with joda time - posted by romain sagean <ro...@hupi.fr> on 2015/11/10 17:20:57 UTC, 7 replies.
- [ANNOUNCE] Announcing Spark 1.5.2 - posted by Reynold Xin <rx...@databricks.com> on 2015/11/10 17:49:31 UTC, 1 replies.
- save data as unique file on each slave node - posted by Chuming Chen <ch...@gmail.com> on 2015/11/10 20:12:20 UTC, 0 replies.
- Querying nested struct fields - posted by pratik khadloya <ti...@gmail.com> on 2015/11/10 20:24:11 UTC, 4 replies.
- though experiment: Can I use spark streaming to replace all of my rest services? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/10 21:28:33 UTC, 1 replies.
- Re: How to configure logging... - posted by Hitoshi <oz...@worksap.co.jp> on 2015/11/10 22:22:27 UTC, 1 replies.
- Re: SF Spark Office Hours Experiment - Friday Afternoon - posted by Holden Karau <ho...@pigscanfly.ca> on 2015/11/10 22:29:47 UTC, 0 replies.
- thought experiment: use spark ML to real time prediction - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/10 22:31:31 UTC, 21 replies.
- Re: Using model saved by MLlib with out creating spark context - posted by Viju K <vi...@gmail.com> on 2015/11/10 22:56:24 UTC, 0 replies.
- PySpark: breakdown application execution time and fine-tuning the application - posted by saluc <sa...@usi.ch> on 2015/11/10 22:59:39 UTC, 0 replies.
- Unexpected traffic size between Driver and Worker node ? - posted by Aliaksei Tsyvunchyk <at...@exadel.com> on 2015/11/10 23:02:43 UTC, 0 replies.
- Spark Packages Configuration Not Found - posted by Jakob Odersky <jo...@gmail.com> on 2015/11/10 23:55:36 UTC, 3 replies.
- Python Kafka support? - posted by Darren Govoni <da...@ontrenet.com> on 2015/11/11 00:37:14 UTC, 1 replies.
- Spark SQL reading json with pre-defined schema - posted by "ganesh.tiwari" <ga...@salesforce.com> on 2015/11/11 02:55:53 UTC, 0 replies.
- Spark-csv error on read AWS s3a in spark 1.4.1 - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/11/11 03:50:53 UTC, 0 replies.
- Terasort on Spark - posted by "Du, Fan" <fa...@intel.com> on 2015/11/11 06:37:38 UTC, 0 replies.
- Spark Thrift doesn't start - posted by DaeHyun Ryu <ry...@kr.ibm.com> on 2015/11/11 07:47:09 UTC, 2 replies.
- Re: Spark Streaming Checkpoint help failed application - posted by Gideon <gi...@volcanodata.com> on 2015/11/11 09:54:54 UTC, 0 replies.
- Why there's no api for SparkContext#textFiles to support multiple inputs ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/11/11 10:20:48 UTC, 9 replies.
- Start python script with SparkLauncher - posted by Andrejs <an...@insight-centre.org> on 2015/11/11 14:57:26 UTC, 2 replies.
- dynamic allocation w/ spark streaming on mesos? - posted by PhuDuc Nguyen <du...@gmail.com> on 2015/11/11 15:09:15 UTC, 7 replies.
- Re: Spark on YARN using Java 1.8 fails - posted by mvle <mv...@us.ibm.com> on 2015/11/11 17:46:12 UTC, 1 replies.
- Creating new Spark context when running in Secure YARN fails - posted by mvle <mv...@us.ibm.com> on 2015/11/11 19:23:52 UTC, 6 replies.
- Porting R code to SparkR - posted by Sanjay Subramanian <sa...@yahoo.com.INVALID> on 2015/11/11 19:36:40 UTC, 0 replies.
- Spark cluster with Java 8 using ./spark-ec2 - posted by Philipp Grulich <ph...@hotmail.de> on 2015/11/11 21:12:32 UTC, 0 replies.
- Different classpath across stages? - posted by John Meehan <jn...@gmail.com> on 2015/11/11 21:36:42 UTC, 0 replies.
- How can you sort wordcounts by counts in stateful_network_wordcount.py example - posted by Amir Rahnama <am...@gmail.com> on 2015/11/11 22:53:31 UTC, 4 replies.
- Upgrading Spark in EC2 clusters - posted by Augustus Hong <au...@branchmetrics.io> on 2015/11/12 00:00:26 UTC, 3 replies.
- Status of 2.11 support? - posted by shajra-cogscale <sh...@cognitivescale.com> on 2015/11/12 00:22:22 UTC, 3 replies.
- Window's Operations on Spark Partitioned RDD - posted by Alan Braithwaite <al...@cloudflare.com> on 2015/11/12 01:48:54 UTC, 4 replies.
- RE: hdfs-ha on mesos - odd bug - posted by "Buttler, David" <bu...@llnl.gov> on 2015/11/12 02:03:01 UTC, 0 replies.
- graphx - trianglecount of 2B edges - posted by Vinod Mangipudi <vi...@gmail.com> on 2015/11/12 02:53:44 UTC, 1 replies.
- how to run unit test for specific component only - posted by weoccc <we...@gmail.com> on 2015/11/12 04:13:57 UTC, 2 replies.
- Partitioned Parquet based external table - posted by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2015/11/12 12:38:51 UTC, 4 replies.
- SparkR cannot handle double-byte characters - posted by Shige Song <sh...@gmail.com> on 2015/11/12 13:23:23 UTC, 0 replies.
- Issue with Spark-redshift - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/11/12 14:32:40 UTC, 0 replies.
- large, dense matrix multiplication - posted by Eilidh Troup <e....@epcc.ed.ac.uk> on 2015/11/12 14:57:27 UTC, 5 replies.
- Conf Settings in Mesos - posted by John Omernik <jo...@omernik.com> on 2015/11/12 15:05:18 UTC, 2 replies.
- PMML export for Decision Trees - posted by Niki Pavlopoulou <ni...@exonar.com> on 2015/11/12 15:13:35 UTC, 0 replies.
- NPE is Spark Running on Mesos in Finegrained Mode - posted by John Omernik <jo...@omernik.com> on 2015/11/12 15:24:04 UTC, 0 replies.
- metastore_db - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/11/12 16:29:52 UTC, 0 replies.
- In Spark application, how to get the passed in configuration? - posted by java8964 <ja...@hotmail.com> on 2015/11/12 17:21:02 UTC, 2 replies.
- Use existing Hive- and SparkContext with SparkR - posted by Tobias Bockrath <tb...@web-computing.de> on 2015/11/12 17:42:08 UTC, 0 replies.
- Issue with spark on hive - posted by rugalcrimson <64...@qq.com> on 2015/11/12 18:28:09 UTC, 1 replies.
- Checkpointing with Kinesis hangs with socket timeouts when driver is relaunched while transforming on a 0 event batch - posted by Hster Geguri <hs...@gmail.com> on 2015/11/12 19:37:10 UTC, 0 replies.
- Powered by Spark page - posted by Nate Kupp <na...@thumbtack.com> on 2015/11/12 20:55:49 UTC, 0 replies.
- HiveServer2 Thrift OOM - posted by Yana Kadiyska <ya...@gmail.com> on 2015/11/13 01:29:33 UTC, 3 replies.
- Re: spark-1.5.1 application detail ui url - posted by Rastan Boroujerdi <ra...@gmail.com> on 2015/11/13 01:47:32 UTC, 0 replies.
- problem with spark.unsafe.offHeap & spark.sql.tungsten.enabled - posted by tyronecai <ty...@163.com> on 2015/11/13 02:20:05 UTC, 1 replies.
- Kafka Offsets after application is restarted using Spark Streaming Checkpointing - posted by kundan kumar <ii...@gmail.com> on 2015/11/13 11:36:20 UTC, 4 replies.
- Spark and Spring Integrations - posted by Netai Biswas <ma...@gmail.com> on 2015/11/13 11:47:27 UTC, 5 replies.
- Spark Executors off-heap memory usage keeps increasing - posted by Balthasar Schopman <b....@tech.leaseweb.com> on 2015/11/13 11:49:31 UTC, 0 replies.
- Training data sets storage requirement - posted by "Veluru, Aruna" <Ar...@scientificgames.com> on 2015/11/13 12:15:12 UTC, 0 replies.
- How is the predict() working in LogisticRegressionModel? - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2015/11/13 12:55:43 UTC, 0 replies.
- Stack Overflow Question - posted by Parin Choganwala <pa...@7parkdata.com> on 2015/11/13 14:26:15 UTC, 1 replies.
- Save GraphX to disk - posted by Gaurav Kumar <ga...@gmail.com> on 2015/11/13 15:08:48 UTC, 3 replies.
- spark 1.4 GC issue - posted by Renu Yadav <yr...@gmail.com> on 2015/11/13 15:31:17 UTC, 4 replies.
- Spark Streaming + SparkSQL, time based windowing queries - posted by Saiph Kappa <sa...@gmail.com> on 2015/11/13 15:59:40 UTC, 0 replies.
- No suitable drivers found for postgresql - posted by satish chandra j <js...@gmail.com> on 2015/11/13 16:14:15 UTC, 3 replies.
- Joining HDFS and JDBC data sources - benchmarks - posted by Eran Medan <eh...@gmail.com> on 2015/11/13 17:57:14 UTC, 1 replies.
- a way to allow spark job to continue despite task failures? - posted by Nicolae Marasoiu <ni...@adswizz.com> on 2015/11/13 18:05:50 UTC, 1 replies.
- Please add us to the Powered by Spark page - posted by Sujit Pal <su...@gmail.com> on 2015/11/13 18:21:09 UTC, 4 replies.
- Re: What is difference btw reduce & fold? - posted by firemonk9 <dh...@gmail.com> on 2015/11/13 18:43:53 UTC, 0 replies.
- hang correlated to number of shards Re: Checkpointing with Kinesis hangs with socket timeouts when driver is relaunched while transforming on a 0 event batch - posted by Hster Geguri <hs...@gmail.com> on 2015/11/13 19:13:06 UTC, 0 replies.
- SequenceFile and object reuse - posted by jeff saremi <je...@hotmail.com> on 2015/11/13 19:29:58 UTC, 3 replies.
- Columnar Statistics - posted by sara mustafa <en...@gmail.com> on 2015/11/13 20:57:25 UTC, 0 replies.
- SparkException: Could not read until the end sequence number of the range - posted by Alan Dipert <al...@dipert.org> on 2015/11/13 21:37:19 UTC, 0 replies.
- Join and HashPartitioner question - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/11/13 21:40:48 UTC, 2 replies.
- does spark ML have some thing like createDataPartition() in R caret package ? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/13 22:21:26 UTC, 1 replies.
- pyspark sql: number of partitions and partition by size? - posted by Wei Chen <we...@gmail.com> on 2015/11/13 23:13:22 UTC, 0 replies.
- send transformed RDD to s3 from slaves - posted by Walrus theCat <wa...@gmail.com> on 2015/11/14 01:56:53 UTC, 4 replies.
- Spark filestreaming issue - posted by "ravi.gawai" <ra...@gmail.com> on 2015/11/14 02:51:22 UTC, 1 replies.
- out of memory error with Parquet - posted by AlexG <sw...@gmail.com> on 2015/11/14 04:32:50 UTC, 2 replies.
- Re: Spark ClosureCleaner or java serializer OOM when trying to grow - posted by rohangpatil <ro...@gmail.com> on 2015/11/14 09:24:45 UTC, 0 replies.
- Need sth like "def groupByKeyWithRDD(partitioner: Partitioner): RDD[(K, RDD[V])] = ???" - posted by chao chu <ch...@gmail.com> on 2015/11/14 10:02:19 UTC, 0 replies.
- Spark job stuck with 0 input records - posted by pratik khadloya <ti...@gmail.com> on 2015/11/15 02:50:27 UTC, 0 replies.
- Very slow startup for jobs containing millions of tasks - posted by Jerry Lam <ch...@gmail.com> on 2015/11/15 03:35:45 UTC, 3 replies.
- Calculating Timeseries Aggregation - posted by Sandip Mehta <sa...@gmail.com> on 2015/11/15 06:32:26 UTC, 6 replies.
- Spark SQL: filter if column substring does not contain a string - posted by YaoPau <jo...@gmail.com> on 2015/11/15 08:49:35 UTC, 1 replies.
- Data Locality Issue - posted by Renu Yadav <yr...@gmail.com> on 2015/11/15 13:24:41 UTC, 1 replies.
- Yarn Spark on EMR - posted by SURAJ SHETH <sh...@gmail.com> on 2015/11/15 17:19:08 UTC, 1 replies.
- ReduceByKeyAndWindow does repartitioning twice on recovering from checkpoint - posted by kundan kumar <ii...@gmail.com> on 2015/11/15 18:05:27 UTC, 1 replies.
- spark sql "create temporary function" scala functions - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/16 00:23:43 UTC, 0 replies.
- how to get the tracking URL with ip address instead of hostname in yarn-cluster model - posted by wangpan <ya...@gmail.com> on 2015/11/16 03:53:25 UTC, 0 replies.
- Spark-shell connecting to Mesos stuck at sched.cpp - posted by Jong Wook Kim <jo...@nyu.edu> on 2015/11/16 03:59:32 UTC, 1 replies.
- DynamoDB Connector? - posted by Charles Cobb <ch...@seas.upenn.edu> on 2015/11/16 04:00:56 UTC, 1 replies.
- How to passing parameters to another java class - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/11/16 05:47:01 UTC, 6 replies.
- Hive on Spark Vs Spark SQL - posted by kiran lonikar <lo...@gmail.com> on 2015/11/16 07:37:17 UTC, 3 replies.
- NoSuchMethodError - posted by Yogesh Vyas <in...@gmail.com> on 2015/11/16 08:02:15 UTC, 5 replies.
- No spark examples jar in maven repository after 1.1.1 ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/11/16 10:27:20 UTC, 2 replies.
- How to enable MetricsServlet sink in Spark 1.5.0? - posted by ihavethepotential <ih...@gmail.com> on 2015/11/16 10:42:23 UTC, 2 replies.
- Size exceeds Integer.MAX_VALUE on EMR 4.0.0 Spark 1.4.1 - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/11/16 11:16:34 UTC, 2 replies.
- Re: Size exceeds Integer.MAX_VALUE (SparkSQL$TreeNodeException: sort, tree) on EMR 4.0.0 Spark 1.4.1 - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/11/16 11:23:31 UTC, 0 replies.
- Hive on Spark orc file empty - posted by 张炜 <zh...@gmail.com> on 2015/11/16 11:40:59 UTC, 2 replies.
- Spark Expand Cluster - posted by dineshranganathan <di...@gmail.com> on 2015/11/16 13:24:45 UTC, 4 replies.
- spark-submit stuck and no output in console - posted by Kayode Odeyemi <dr...@gmail.com> on 2015/11/16 14:50:49 UTC, 12 replies.
- How 'select name,age from TBL_STUDENT where age = 37' is optimized when caching it - posted by Todd <bi...@163.com> on 2015/11/16 15:35:01 UTC, 0 replies.
- [POWERED BY] Please add our organization - posted by Adrien Mogenet <ad...@contentsquare.com> on 2015/11/16 16:04:05 UTC, 0 replies.
- [Spark-Avro] Question related to the Avro data generated by Spark-Avro - posted by java8964 <ja...@hotmail.com> on 2015/11/16 16:15:04 UTC, 0 replies.
- Spark SQL UDAF works fine locally, OutOfMemory on YARN - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/11/16 18:33:05 UTC, 0 replies.
- Re: How 'select name,age from TBL_STUDENT where age = 37' is optimized when caching it - posted by Xiao Li <ga...@gmail.com> on 2015/11/16 19:05:11 UTC, 0 replies.
- Spark Powered By Page - posted by Alex Rovner <al...@magnetic.com> on 2015/11/16 20:17:52 UTC, 0 replies.
- how can evenly distribute my records in all partition - posted by prateek arora <pr...@gmail.com> on 2015/11/16 20:41:37 UTC, 6 replies.
- YARN Labels - posted by Alex Rovner <al...@magnetic.com> on 2015/11/16 21:52:18 UTC, 4 replies.
- [SPARK STREAMING] Questions regarding foreachPartition - posted by Nipun Arora <ni...@gmail.com> on 2015/11/16 23:02:44 UTC, 2 replies.
- Parallelizing operations using Spark - posted by Susheel Kumar <su...@gmail.com> on 2015/11/16 23:44:16 UTC, 2 replies.
- Re: Spark Implementation of XGBoost - posted by Joseph Bradley <jo...@databricks.com> on 2015/11/17 00:54:02 UTC, 0 replies.
- Stage retry limit - posted by pnpritchard <ni...@falkonry.com> on 2015/11/17 02:32:17 UTC, 0 replies.
- Mesos cluster dispatcher doesn't respect most args from the submit req - posted by Jo Voordeckers <jo...@gmail.com> on 2015/11/17 02:46:36 UTC, 4 replies.
- Spark Job is getting killed after certain hours - posted by Nikhil Gs <gs...@gmail.com> on 2015/11/17 03:00:54 UTC, 5 replies.
- ISDATE Function - posted by Ravisankar Mani <rr...@gmail.com> on 2015/11/17 07:46:03 UTC, 3 replies.
- Reading non UTF-8 files via spark streaming - posted by tarek_abouzeid <ta...@yahoo.com> on 2015/11/17 08:13:38 UTC, 0 replies.
- Avro RDD to DataFrame - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/17 08:30:59 UTC, 0 replies.
- Off-heap memory usage of Spark Executors keeps increasing - posted by "b.schopman" <b....@tech.leaseweb.com> on 2015/11/17 09:31:58 UTC, 0 replies.
- zeppelin (or spark-shell) with HBase fails on executor level - posted by 임정택 <ka...@gmail.com> on 2015/11/17 10:01:00 UTC, 11 replies.
- synchronizing streams of different kafka topics - posted by Antony Mayi <an...@yahoo.com.INVALID> on 2015/11/17 10:07:29 UTC, 1 replies.
- Count of streams processed - posted by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2015/11/17 11:48:00 UTC, 1 replies.
- Distinct on key-value pair of JavaRDD - posted by Ramkumar V <ra...@gmail.com> on 2015/11/17 12:00:10 UTC, 1 replies.
- Working with RDD from Java - posted by frula00 <iv...@crossing-technologies.com> on 2015/11/17 12:06:22 UTC, 2 replies.
- How to create nested structure from RDD - posted by fhussonnois <fh...@gmail.com> on 2015/11/17 18:06:48 UTC, 2 replies.
- Issue while Spark Job fetching data from Cassandra DB - posted by satish chandra j <js...@gmail.com> on 2015/11/17 19:15:15 UTC, 4 replies.
- Is there a way to delete task history besides using a ttl? - posted by Jonathan Coveney <jc...@gmail.com> on 2015/11/17 20:45:28 UTC, 1 replies.
- Any way to get raw score from MultilayerPerceptronClassificationModel ? - posted by Robert Dodier <ro...@gmail.com> on 2015/11/17 22:38:14 UTC, 2 replies.
- Invocation of StreamingContext.stop() hangs in 1.5 - posted by jiten <ji...@gmail.com> on 2015/11/17 22:50:47 UTC, 2 replies.
- WARN LoadSnappy: Snappy native library not loaded - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/17 23:22:48 UTC, 4 replies.
- pyspark ML pipeline with shared data - posted by Dominik Dahlem <do...@gmail.com> on 2015/11/18 00:10:15 UTC, 0 replies.
- Spark LogisticRegression returns scaled coefficients - posted by njoshi <ni...@teamaol.com> on 2015/11/18 01:11:01 UTC, 3 replies.
- RE: TightVNC - Application Monitor (right pane) - posted by Tim Barthram <Ti...@iag.com.au> on 2015/11/18 01:44:21 UTC, 0 replies.
- Streaming Job gives error after changing to version 1.5.2 - posted by swetha <sw...@gmail.com> on 2015/11/18 02:34:21 UTC, 9 replies.
- kafka streaming 1.5.2 - ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaReceiver - posted by tim_b123 <ti...@iag.com.au> on 2015/11/18 03:04:28 UTC, 1 replies.
- Spark build error - posted by 金国栋 <sc...@gmail.com> on 2015/11/18 03:45:36 UTC, 6 replies.
- Incorrect results with reduceByKey - posted by tovbinm <to...@gmail.com> on 2015/11/18 05:28:59 UTC, 2 replies.
- spark with breeze error of NoClassDefFoundError - posted by Jack Yang <ji...@uow.edu.au> on 2015/11/18 05:33:35 UTC, 11 replies.
- Additional Master daemon classpath - posted by Michal Klos <mi...@gmail.com> on 2015/11/18 06:06:35 UTC, 3 replies.
- How to return a pair RDD from an RDD that has foreachPartition applied? - posted by swetha <sw...@gmail.com> on 2015/11/18 08:51:56 UTC, 2 replies.
- subscribe - posted by Alex Luya <al...@gmail.com> on 2015/11/18 10:19:16 UTC, 0 replies.
- (send this email to subscribe) - posted by Alex Luya <al...@gmail.com> on 2015/11/18 10:23:28 UTC, 1 replies.
- How to disable SparkUI programmatically? - posted by Alex Luya <al...@gmail.com> on 2015/11/18 10:29:58 UTC, 2 replies.
- Shuffle FileNotFound Exception - posted by Tom Arnfeld <to...@duedil.com> on 2015/11/18 13:00:47 UTC, 3 replies.
- orc read issue in spark - posted by Renu Yadav <yr...@gmail.com> on 2015/11/18 14:51:21 UTC, 1 replies.
- Unable to load native-hadoop library for your platform - already loaded in another classloader - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/18 18:50:59 UTC, 0 replies.
- Spark+Groovy: java.lang.ClassNotFoundException: org.apache.spark.rpc.akka.AkkaRpcEnvFactory - posted by tog <gu...@gmail.com> on 2015/11/18 19:33:46 UTC, 0 replies.
- Spark Summit East 2016 CFP - Closing in 5 days - posted by Scott walent <sc...@gmail.com> on 2015/11/18 19:58:45 UTC, 0 replies.
- Unable to import SharedSparkContext - posted by njoshi <ni...@teamaol.com> on 2015/11/18 20:08:56 UTC, 4 replies.
- GraphX stopped without finishing and with no ERRORs ! - posted by Khaled Ammar <kh...@gmail.com> on 2015/11/18 20:35:39 UTC, 0 replies.
- DataFrames initial jdbc loading - will it be utilizing a filter predicate? - posted by Eran Medan <er...@gmail.com> on 2015/11/18 20:36:44 UTC, 2 replies.
- newbie simple app, small data set: Py4JJavaError java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/18 21:12:22 UTC, 0 replies.
- Apache Groovy and Spark - posted by tog <gu...@gmail.com> on 2015/11/18 21:35:59 UTC, 3 replies.
- getting different results from same line of code repeated - posted by Walrus theCat <wa...@gmail.com> on 2015/11/18 23:52:23 UTC, 3 replies.
- unsubscribe - posted by VJ Anand <vj...@sankia.com> on 2015/11/19 00:24:52 UTC, 0 replies.
- how to group timestamp data and filter on it - posted by Cassa L <lc...@gmail.com> on 2015/11/19 01:26:32 UTC, 1 replies.
- Re: Spark job workflow engine recommendations - posted by Vikram Kone <vi...@gmail.com> on 2015/11/19 02:47:30 UTC, 3 replies.
- DataFrame.insertIntoJDBC throws AnalysisException -- cannot save - posted by jonpowell <jo...@mir3.com> on 2015/11/19 03:28:45 UTC, 0 replies.
- Do windowing functions require hive support? - posted by Stephen Boesch <ja...@gmail.com> on 2015/11/19 04:11:42 UTC, 6 replies.
- How to clear the temp files that gets created by shuffle in Spark Streaming - posted by swetha <sw...@gmail.com> on 2015/11/19 04:28:19 UTC, 2 replies.
- Spark JDBCRDD query - posted by sunil m <26...@gmail.com> on 2015/11/19 04:43:54 UTC, 0 replies.
- Spark twitter streaming in Java - posted by Soni spark <so...@gmail.com> on 2015/11/19 08:04:59 UTC, 1 replies.
- Spark Monitoring to get Spark GCs and records processed - posted by rakesh rakshit <ih...@gmail.com> on 2015/11/19 08:30:25 UTC, 0 replies.
- [SPARK STREAMING] multiple hosts and multiple ports for Stream job - posted by diplomatic Guru <di...@gmail.com> on 2015/11/19 09:35:48 UTC, 0 replies.
- (Unknown) - posted by aman solanki <yo...@gmail.com> on 2015/11/19 09:53:09 UTC, 1 replies.
- Re: - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/11/19 09:56:48 UTC, 2 replies.
- Reading from RabbitMq via Apache Spark Streaming - posted by D <su...@gmail.com> on 2015/11/19 10:02:17 UTC, 2 replies.
- SparkSQL optimizations and Spark streaming - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/11/19 10:17:26 UTC, 1 replies.
- Re: doubts on parquet - posted by Cheng Lian <li...@gmail.com> on 2015/11/19 10:24:19 UTC, 1 replies.
- ClassNotFound for exception class in Spark 1.5.x - posted by Zsolt Tóth <to...@gmail.com> on 2015/11/19 11:02:24 UTC, 2 replies.
- Re: Re: driver ClassNotFoundException when MySQL JDBC exceptions are thrown on executor - posted by lu...@sina.com on 2015/11/19 11:03:06 UTC, 0 replies.
- Re: Re: driver ClassNotFoundException when MySQL JDBC exceptions are thrown on executor - posted by Jeff Zhang <zj...@gmail.com> on 2015/11/19 11:20:39 UTC, 1 replies.
- Why Spark Streaming keeps all batches in memory after processing? - posted by Artem Moskvin <mo...@gmail.com> on 2015/11/19 11:45:20 UTC, 0 replies.
- Re: Re: Re: driver ClassNotFoundException when MySQL JDBC exceptions are thrown on executor - posted by lu...@sina.com on 2015/11/19 11:52:28 UTC, 0 replies.
- FastUtil DataStructures in Spark - posted by swetha <sw...@gmail.com> on 2015/11/19 12:06:18 UTC, 0 replies.
- Error not found value sqlContext - posted by satish chandra j <js...@gmail.com> on 2015/11/19 13:19:37 UTC, 6 replies.
- Moving avg in spark streaming - posted by anshu shukla <an...@gmail.com> on 2015/11/19 13:28:50 UTC, 0 replies.
- StackOverflowError while iterates model.freqItemsets - posted by ericyang <it...@itri.org.tw> on 2015/11/19 14:56:44 UTC, 0 replies.
- Receiver stage fails but Spark application stands RUNNING - posted by Pierre Van Ingelandt <pi...@airfrance.fr> on 2015/11/19 15:47:10 UTC, 0 replies.
- Spark streaming and custom partitioning - posted by Sachin Mousli <sa...@gmx.com> on 2015/11/19 15:58:35 UTC, 1 replies.
- Spark 1.5.3 release - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/11/19 16:52:53 UTC, 0 replies.
- PySpark Lost Executors - posted by Ro...@thomsonreuters.com on 2015/11/19 17:02:18 UTC, 5 replies.
- create a table for csv files - posted by xiaohe lan <zo...@gmail.com> on 2015/11/19 17:23:37 UTC, 1 replies.
- Want 1-1 map between input files and output files in map-only job - posted by Arun Luthra <ar...@gmail.com> on 2015/11/19 17:28:54 UTC, 0 replies.
- what is algorithm to optimize function with nonlinear constraints - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/11/19 18:42:33 UTC, 0 replies.
- Shuffle performance tuning. How to tune netty? - posted by t3l <t3...@threelights.de> on 2015/11/19 19:01:25 UTC, 0 replies.
- External Table not getting updated from parquet files written by spark streaming - posted by Abhishek Anand <ab...@gmail.com> on 2015/11/19 20:18:57 UTC, 1 replies.
- Blocked REPL commands - posted by Jakob Odersky <jo...@gmail.com> on 2015/11/19 20:44:51 UTC, 2 replies.
- spark-submit is throwing NPE when trying to submit a random forest model - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/11/19 22:21:35 UTC, 1 replies.
- Configuring Log4J (Spark 1.5 on EMR 4.1) - posted by "Afshartous, Nick" <na...@turbine.com> on 2015/11/19 22:30:44 UTC, 3 replies.
- Spark Tasks on second node never return in Yarn when I have more than 1 task node - posted by Shuai Zheng <sz...@gmail.com> on 2015/11/19 22:32:59 UTC, 2 replies.
- Drop multiple columns in the DataFrame API - posted by Benjamin Fradet <be...@gmail.com> on 2015/11/19 23:15:50 UTC, 2 replies.
- newbie: unable to use all my cores and memory - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/20 01:02:45 UTC, 0 replies.
- question about combining small input splits - posted by Nezih Yigitbasi <ny...@netflix.com.INVALID> on 2015/11/20 02:28:03 UTC, 1 replies.
- Re: python.worker.memory parameter - posted by Ted Yu <yu...@gmail.com> on 2015/11/20 03:03:21 UTC, 0 replies.
- spark streaming problem saveAsTextFiles() does not write valid JSON to HDFS - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/20 03:58:46 UTC, 1 replies.
- Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2 when compile spark-1.5.2 - posted by "ck032@126.com" <ck...@126.com> on 2015/11/20 04:10:44 UTC, 0 replies.
- SparkR DataFrame , Out of memory exception for very small file. - posted by vipulrai <vi...@gmail.com> on 2015/11/20 05:10:39 UTC, 7 replies.
- has any spark write orc document - posted by zhangjp <59...@qq.com> on 2015/11/20 07:59:39 UTC, 2 replies.
- Re: has any spark write orc document - posted by zhangjp <59...@qq.com> on 2015/11/20 08:47:07 UTC, 1 replies.
- Question About Task Number In A spark Stage - posted by Gerald-G <sh...@gmail.com> on 2015/11/20 10:23:10 UTC, 0 replies.
- spark-shell issue Job in illegal state & sparkcontext not serializable - posted by "Balachandar R.A." <ba...@gmail.com> on 2015/11/20 11:30:09 UTC, 0 replies.
- How to control number of parquet files generated when using partitionBy - posted by glennie <gs...@audienceproject.com> on 2015/11/20 12:03:41 UTC, 1 replies.
- Re: RE: Error not found value sqlContext - posted by prosp4300 <pr...@163.com> on 2015/11/20 13:16:00 UTC, 0 replies.
- Spark Streaming - stream between 2 applications - posted by Saiph Kappa <sa...@gmail.com> on 2015/11/20 14:48:52 UTC, 4 replies.
- Re: newbie: unable to use all my cores and memory - posted by Igor Berman <ig...@gmail.com> on 2015/11/20 15:22:07 UTC, 1 replies.
- Data in one partition after reduceByKey - posted by Patrick McGloin <mc...@gmail.com> on 2015/11/20 17:17:21 UTC, 2 replies.
- how to use sc.hadoopConfiguration from pyspark - posted by Tamas Szuromi <ta...@odigeo.com> on 2015/11/20 17:23:11 UTC, 3 replies.
- Hive error after update from 1.4.1 to 1.5.2 - posted by Bryan Jeffrey <br...@gmail.com> on 2015/11/20 20:13:32 UTC, 2 replies.
- Does spark streaming write ahead log writes all received data to HDFS ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/11/20 21:26:25 UTC, 1 replies.
- Correlation between 2 consecutive RDDs in Dstream - posted by anshu shukla <an...@gmail.com> on 2015/11/20 22:17:05 UTC, 0 replies.
- FW: starting spark-shell throws /tmp/hive on HDFS should be writable error - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/11/20 23:43:21 UTC, 0 replies.
- How to run two operations on the same RDD simultaneously - posted by jluan <ja...@gmail.com> on 2015/11/21 00:38:43 UTC, 2 replies.
- Re: updateStateByKey schedule time - posted by Tathagata Das <td...@databricks.com> on 2015/11/21 01:05:27 UTC, 0 replies.
- How to kill spark applications submitted using spark-submit reliably? - posted by Vikram Kone <vi...@gmail.com> on 2015/11/21 03:46:28 UTC, 11 replies.
- Initial State - posted by Bryan <br...@gmail.com> on 2015/11/21 03:52:28 UTC, 2 replies.
- Error in Saving the MLlib models - posted by hokam chauhan <ho...@gmail.com> on 2015/11/21 08:30:00 UTC, 0 replies.
- How to kill the spark job using Java API. - posted by Hokam Singh Chauhan <ho...@gmail.com> on 2015/11/21 08:35:59 UTC, 0 replies.
- Spark : merging object with approximation - posted by OcterA <mr...@gmail.com> on 2015/11/21 11:24:47 UTC, 1 replies.
- Spark-SQL idiomatic way of adding a new partition or writing to Partitioned Persistent Table - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/21 11:25:14 UTC, 3 replies.
- How to adjust Spark shell table width - posted by Fengdong Yu <fe...@everstring.com> on 2015/11/21 15:24:43 UTC, 2 replies.
- RDD partition after calling mapToPair - posted by trung kien <ki...@gmail.com> on 2015/11/21 18:46:08 UTC, 4 replies.
- spark shuffle - posted by Shushant Arora <sh...@gmail.com> on 2015/11/21 18:51:00 UTC, 1 replies.
- newbie : why are thousands of empty files being created on HDFS? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/21 19:29:31 UTC, 8 replies.
- Closures sent once per executor or copied with each tasks? - posted by emao <io...@gmail.com> on 2015/11/21 21:14:10 UTC, 0 replies.
- Datastore for GraphX - posted by Ilango Ravi <ra...@gmail.com> on 2015/11/21 22:08:15 UTC, 1 replies.
- JavaStreamingContext nullpointer exception while fetching data from Cassandra - posted by "ravi.gawai" <ra...@gmail.com> on 2015/11/22 03:03:13 UTC, 0 replies.
- [Spark Streaming] Unable to write checkpoint when restart - posted by Sea <26...@qq.com> on 2015/11/22 06:19:30 UTC, 0 replies.
- Need Help Diagnosing/operating/tuning - posted by Jeremy Davis <jd...@marketshare.com> on 2015/11/22 21:38:39 UTC, 2 replies.
- A Problem About Running Spark 1.5 on YARN with Dynamic Allocation - posted by 谢廷稳 <xi...@gmail.com> on 2015/11/23 13:41:35 UTC, 0 replies.
- Re: A Problem About Running Spark 1.5 on YARN with Dynamic Allocation - posted by Saisai Shao <sa...@gmail.com> on 2015/11/23 14:00:20 UTC, 16 replies.
- DateTime Support - Hive Parquet - posted by Bryan Jeffrey <br...@gmail.com> on 2015/11/23 14:40:48 UTC, 5 replies.
- A question about sql clustering - posted by Cesar Flores <ce...@gmail.com> on 2015/11/23 16:27:47 UTC, 0 replies.
- spark 1.4.1 to oracle 11g write to an existing table - posted by Siva Gudavalli <gs...@gmail.com> on 2015/11/23 17:57:10 UTC, 0 replies.
- Getting different DESCRIBE results between SparkSQL and Hive - posted by YaoPau <jo...@gmail.com> on 2015/11/23 18:46:06 UTC, 0 replies.
- Spark Kafka Direct Error - posted by swetha <sw...@gmail.com> on 2015/11/23 19:57:03 UTC, 6 replies.
- how to use DataFrame.na.fill based on condition - posted by Vishnu Viswanath <vi...@gmail.com> on 2015/11/23 20:05:05 UTC, 3 replies.
- Dataframe constructor - posted by spark_user_2015 <li...@adobe.com> on 2015/11/23 20:09:18 UTC, 1 replies.
- spark-csv on Amazon EMR - posted by Daniel Lopes <da...@bankfacil.com.br> on 2015/11/23 21:16:47 UTC, 0 replies.
- Add Data Science Serbia meetup - posted by Darko Marjanovic <da...@thingsolver.com> on 2015/11/23 23:34:31 UTC, 0 replies.
- Any workaround for Kafka couldn't find leaders for set? - posted by Hudong Wang <ju...@hotmail.com> on 2015/11/23 23:54:46 UTC, 1 replies.
- Relation between RDDs, DataFrames and Project Tungsten - posted by Jakob Odersky <jo...@gmail.com> on 2015/11/24 01:18:49 UTC, 4 replies.
- spark-ec2 script to launch cluster running Spark 1.5.2 built with HIVE? - posted by Jeff Schecter <je...@levelmoney.com> on 2015/11/24 01:48:43 UTC, 2 replies.
- How to have Kafka Direct Consumers show up in Kafka Consumer reporting? - posted by swetha <sw...@gmail.com> on 2015/11/24 02:37:30 UTC, 4 replies.
- Port Control for YARN-Aware Spark - posted by gpriestley <gr...@gmail.com> on 2015/11/24 03:24:53 UTC, 1 replies.
- Apache Cassandra Docker Images? - posted by Renato Perini <re...@gmail.com> on 2015/11/24 03:44:22 UTC, 0 replies.
- Spark Streaming idempotent writes to HDFS - posted by Michael <mf...@fastest.cc> on 2015/11/24 05:58:53 UTC, 3 replies.
- load multiple directory using dataframe load - posted by Renu Yadav <yr...@gmail.com> on 2015/11/24 06:07:47 UTC, 1 replies.
- Hi, how can I use BigInt/Long as type of size in Vector.sparse? - posted by "alexanderwu (吴承霖)" <al...@tencent.com> on 2015/11/24 09:33:39 UTC, 0 replies.
- Getting ParquetDecodingException when I am running my spark application from spark-submit - posted by Kapil Raaj <ca...@gmail.com> on 2015/11/24 10:09:13 UTC, 0 replies.
- indexedrdd and radix tree: how to search indexedRDD using all prefixes? - posted by Mina <se...@yahoo.com> on 2015/11/24 10:36:10 UTC, 1 replies.
- [streaming] KafkaUtils.createDirectStream - how to start streaming from checkpoints? - posted by ponkin <al...@ya.ru> on 2015/11/24 10:46:16 UTC, 2 replies.
- Spark 1.6 Build - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/11/24 13:16:45 UTC, 7 replies.
- Experiences about NoSQL databases with Spark - posted by sparkuser2345 <hm...@gmail.com> on 2015/11/24 13:46:10 UTC, 4 replies.
- Spark SQL Save CSV with JSON Column - posted by Ro...@thomsonreuters.com on 2015/11/24 16:36:05 UTC, 1 replies.
- Is it relevant to use BinaryClassificationMetrics.aucROC / aucPR with LogisticRegressionModel ? - posted by jmvllt <mo...@gmail.com> on 2015/11/24 18:19:52 UTC, 3 replies.
- Getting the batch time of the active batches in spark streaming - posted by Abhishek Anand <ab...@gmail.com> on 2015/11/24 22:50:50 UTC, 2 replies.
- Spark sql-1.4.1 DataFrameWrite.jdbc() SaveMode.Append - posted by Siva Gudavalli <gs...@gmail.com> on 2015/11/24 23:04:43 UTC, 0 replies.
- Adding more slaves to a running cluster - posted by Dillian Murphey <cr...@gmail.com> on 2015/11/25 01:40:07 UTC, 3 replies.
- java.io.IOException when using KryoSerializer - posted by Piero Cinquegrana <pc...@marketshare.com> on 2015/11/25 05:02:56 UTC, 0 replies.
- Does receiver based approach lose any data in case of a leader/broker loss in Spark Streaming? - posted by SRK <sw...@gmail.com> on 2015/11/25 05:55:47 UTC, 1 replies.
- queries on Spork (Pig on Spark) - posted by Divya Gehlot <di...@gmail.com> on 2015/11/25 06:57:12 UTC, 2 replies.
- Why does a 3.8 T dataset take up 11.59 Tb on HDFS - posted by AlexG <sw...@gmail.com> on 2015/11/25 07:31:34 UTC, 6 replies.
- Conversely, Hive is performing better than Spark-Sql - posted by UMESH CHAUDHARY <um...@gmail.com> on 2015/11/25 08:00:19 UTC, 1 replies.
- Spark 1.4.2- java.io.FileNotFoundException: Job aborted due to stage failure - posted by Sahil Sareen <sa...@gmail.com> on 2015/11/25 08:06:25 UTC, 0 replies.
- Fwd: pyspark: Error when training a GMM with an initial GaussianMixtureModel - posted by Guillaume Maze <ma...@gmail.com> on 2015/11/25 10:13:23 UTC, 0 replies.
- locality level counter - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/11/25 12:37:42 UTC, 0 replies.
- [Spark Streaming] How to clear old data from Stream State? - posted by diplomatic Guru <di...@gmail.com> on 2015/11/25 14:00:10 UTC, 2 replies.
- Spark, Windows 7 python shell non-reachable ip address - posted by Shuo Wang <sh...@gmail.com> on 2015/11/25 14:37:23 UTC, 3 replies.
- Spark Driver Port Details - posted by aman solanki <yo...@gmail.com> on 2015/11/25 15:15:00 UTC, 1 replies.
- JNI native library problem java.lang.UnsatisfiedLinkError - posted by Oriol López Massaguer <or...@gmail.com> on 2015/11/25 16:20:25 UTC, 0 replies.
- Spark 1.5.2 JNI native library java.lang.UnsatisfiedLinkError - posted by Oriol López Massaguer <or...@gmail.com> on 2015/11/25 16:32:07 UTC, 2 replies.
- Partial data transfer from one partition to other - posted by Samarth Rastogi <sr...@thunderhead.com> on 2015/11/25 17:31:18 UTC, 0 replies.
- data local read counter - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/11/25 17:37:02 UTC, 0 replies.
- sc.textFile() does not count lines properly? - posted by George Sigletos <si...@textkernel.nl> on 2015/11/25 18:06:45 UTC, 1 replies.
- [ANNOUNCE] CFP open for ApacheCon North America 2016 - posted by Rich Bowen <rb...@rcbowen.com> on 2015/11/25 18:32:10 UTC, 0 replies.
- Automatic driver restart does not seem to be working in Spark Standalone - posted by SRK <sw...@gmail.com> on 2015/11/25 19:46:10 UTC, 5 replies.
- Queue in Spark standalone mode - posted by sunil m <26...@gmail.com> on 2015/11/25 20:29:23 UTC, 1 replies.
- UDF with 2 arguments - posted by Daniel Lopes <da...@bankfacil.com.br> on 2015/11/25 21:01:19 UTC, 2 replies.
- How to gracefully shutdown a Spark Streaming Job? - posted by SRK <sw...@gmail.com> on 2015/11/25 22:16:34 UTC, 1 replies.
- Spark- Cassandra Connector Error - posted by ahlusar <ah...@gmail.com> on 2015/11/25 22:27:16 UTC, 0 replies.
- Building Spark without hive libraries - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/11/25 22:30:00 UTC, 7 replies.
- Adding new column to Dataframe - posted by Vishnu Viswanath <vi...@gmail.com> on 2015/11/25 23:39:04 UTC, 5 replies.
- Error in block pushing thread puts the KinesisReceiver in a stuck state - posted by Spark Newbie <sp...@gmail.com> on 2015/11/26 00:56:19 UTC, 1 replies.
- SparkException: Failed to get broadcast_10_piece0 - posted by Spark Newbie <sp...@gmail.com> on 2015/11/26 00:59:17 UTC, 3 replies.
- send this email to unsubscribe - posted by "ngocan211 ." <ng...@gmail.com> on 2015/11/26 03:21:56 UTC, 0 replies.
- Obtaining Job Id for query submitted via Spark Thrift Server - posted by Jagrut Sharma <ja...@gmail.com> on 2015/11/26 04:55:39 UTC, 2 replies.
- log4j custom appender ClassNotFoundException with spark 1.5.2 - posted by lev <ka...@gmail.com> on 2015/11/26 06:23:33 UTC, 0 replies.
- JavaPairRDD.treeAggregate - posted by amit tewari <am...@gmail.com> on 2015/11/26 06:34:57 UTC, 0 replies.
- Re: Spark REST Job server feedback? - posted by Deenar Toraskar <de...@gmail.com> on 2015/11/26 08:04:30 UTC, 1 replies.
- Spark on Mesos with Centos 6.6 NFS - posted by leonidas <mo...@gmail.com> on 2015/11/26 08:52:21 UTC, 0 replies.
- controlling parquet file sizes for faster transfer to S3 from HDFS - posted by AlexG <sw...@gmail.com> on 2015/11/26 09:05:41 UTC, 0 replies.
- starting start-master.sh throws "java.lang.ClassNotFoundException: org.slf4j.Logger" error - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/11/26 09:58:19 UTC, 0 replies.
- Re: RE: Spark checkpoint problem - posted by eric wong <wi...@gmail.com> on 2015/11/26 10:20:50 UTC, 0 replies.
- ClassNotFoundException with a uber jar. - posted by Marc de Palol <ph...@gmail.com> on 2015/11/26 11:49:12 UTC, 1 replies.
- java.io.FileNotFoundException: Job aborted due to stage failure - posted by Sahil Sareen <sa...@gmail.com> on 2015/11/26 12:03:39 UTC, 1 replies.
- custom inputformat recordreader - posted by Patcharee Thongtra <Pa...@uni.no> on 2015/11/26 13:50:26 UTC, 1 replies.
- MySQLSyntaxErrorException when connect hive to sparksql - posted by lu...@sina.com on 2015/11/26 14:26:03 UTC, 1 replies.
- Help with Couchbase connector error - posted by Eyal Sharon <ey...@scene53.com> on 2015/11/26 16:31:24 UTC, 5 replies.
- building spark from 1.3 release without Hive - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/11/26 17:31:23 UTC, 0 replies.
- question about combining small parquet files - posted by Nezih Yigitbasi <ny...@netflix.com.INVALID> on 2015/11/26 18:43:38 UTC, 2 replies.
- possible bug spark/python/pyspark/rdd.py portable_hash() - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/11/26 20:30:09 UTC, 6 replies.
- Grid search with Random Forest - posted by Ndjido Ardo Bar <nd...@gmail.com> on 2015/11/26 21:53:05 UTC, 0 replies.
- Unable to use "Batch Start Time" on worker nodes. - posted by Abhishek Anand <ab...@gmail.com> on 2015/11/26 22:33:13 UTC, 2 replies.
- Stop Spark yarn-client job - posted by Jagat Singh <ja...@gmail.com> on 2015/11/27 01:10:09 UTC, 1 replies.
- Millions of entities in custom Hadoop InputFormat and broadcast variable - posted by Anfernee Xu <an...@gmail.com> on 2015/11/27 07:06:01 UTC, 1 replies.
- Optimizing large collect operations - posted by Gylfi <gy...@berkeley.edu> on 2015/11/27 07:41:53 UTC, 1 replies.
- error while creating HiveContext - posted by "Chandra Mohan, Ananda Vel Murugan" <An...@honeywell.com> on 2015/11/27 08:04:40 UTC, 4 replies.
- Re: GraphX - How to make a directed graph an undirected graph? - posted by Robineast <Ro...@xense.co.uk> on 2015/11/27 08:06:17 UTC, 0 replies.
- Spark on yarn vs spark standalone - posted by cs user <ac...@gmail.com> on 2015/11/27 08:36:18 UTC, 4 replies.
- how to using local repository in spark[dev] - posted by lihu <li...@gmail.com> on 2015/11/27 09:03:33 UTC, 1 replies.
- Re: WARN MemoryStore: Not enough space - posted by Gylfi <gy...@berkeley.edu> on 2015/11/27 09:56:43 UTC, 0 replies.
- Permanent RDD growing with Kafka DirectStream - posted by "Uwe@Moosheimer.com" <Uw...@Moosheimer.com> on 2015/11/27 11:25:07 UTC, 1 replies.
- Hive using Spark engine alone - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/11/27 11:43:47 UTC, 2 replies.
- In yarn-client mode, is it the driver or application master that issue commands to executors? - posted by Nisrina Luthfiyati <ni...@gmail.com> on 2015/11/27 12:12:02 UTC, 3 replies.
- Cant start master on windows 7 - posted by Shuo Wang <sh...@gmail.com> on 2015/11/27 16:27:56 UTC, 4 replies.
- Spark Streaming on mesos - posted by Renjie Liu <li...@gmail.com> on 2015/11/27 17:27:44 UTC, 6 replies.
- Windows shared folder - posted by Shuo Wang <sh...@gmail.com> on 2015/11/27 23:29:11 UTC, 1 replies.
- Give parallelize a dummy Arraylist length N to control RDD size? - posted by Jim <ji...@supportml.com> on 2015/11/28 00:18:02 UTC, 0 replies.
- Slow Job Execution in Spark with Spark Jobserver and Cassandra - posted by Patrick Brown <pa...@gmail.com> on 2015/11/28 02:03:44 UTC, 0 replies.
- How to get a single available message from kafka (case where OffsetRange.fromOffset == OffsetRange.untilOffset) - posted by Nikos Viorres <nv...@gmail.com> on 2015/11/28 15:35:53 UTC, 2 replies.
- df.partitionBy().parquet() java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Don Drake <do...@gmail.com> on 2015/11/28 16:02:59 UTC, 0 replies.
- streaming from Flume, save to Cassandra and Solr with Banana as search engine - posted by Oleksandr Yermolenko <aa...@sumix.com> on 2015/11/28 18:45:10 UTC, 0 replies.
- StructType for oracle.sql.STRUCT - posted by Pieter Minnaar <pi...@gmail.com> on 2015/11/28 21:15:20 UTC, 1 replies.
- Confirm this won't parallelize/partition? - posted by Jim Lohse <ji...@supportml.com> on 2015/11/28 21:23:07 UTC, 0 replies.
- General question on using StringIndexer in SparkML - posted by Vishnu Viswanath <vi...@gmail.com> on 2015/11/28 21:33:15 UTC, 4 replies.
- Spark and simulated annealing - posted by marfago <ma...@inwind.it> on 2015/11/28 23:25:59 UTC, 1 replies.
- Retrieve best parameters from CrossValidator - posted by BenFradet <be...@gmail.com> on 2015/11/29 00:36:06 UTC, 2 replies.
- storing result of aggregation of spark streaming - posted by Amir Rahnama <am...@gmail.com> on 2015/11/29 00:41:52 UTC, 2 replies.
- Parquet files not getting coalesced to smaller number of files - posted by SRK <sw...@gmail.com> on 2015/11/29 05:21:48 UTC, 1 replies.
- Debug Spark - posted by Masf <ma...@gmail.com> on 2015/11/29 17:18:25 UTC, 7 replies.
- Multiplication on decimals in a dataframe query - posted by Philip Dodds <ph...@gmail.com> on 2015/11/29 20:11:34 UTC, 0 replies.
- Re: How to work with a joined rdd in pyspark? - posted by Gylfi <gy...@berkeley.edu> on 2015/11/29 20:32:24 UTC, 5 replies.
- Re: partition RDD of images - posted by Gylfi <gy...@berkeley.edu> on 2015/11/30 07:27:58 UTC, 0 replies.
- spark sql throw java.lang.ArrayIndexOutOfBoundsException when use table.* - posted by "ouruia@cnsuning.com" <ou...@cnsuning.com> on 2015/11/30 07:30:35 UTC, 0 replies.
- PySpark failing on a mid-sized broadcast - posted by ameyc <am...@gmail.com> on 2015/11/30 10:09:03 UTC, 1 replies.
- dfs.blocksize is not applicable to some cases - posted by Jung <jb...@naver.com> on 2015/11/30 10:55:33 UTC, 1 replies.
- Persisting closed sessions to external store inside updateStateByKey - posted by Anthony Brew <at...@gmail.com> on 2015/11/30 11:43:45 UTC, 0 replies.
- Spark DStream Data stored out of order in Cassandra - posted by "Prateek ." <pr...@aricent.com> on 2015/11/30 12:37:46 UTC, 0 replies.
- No documentation for how to write custom Transformer in ml pipeline ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/11/30 13:18:31 UTC, 0 replies.
- Re: Spark DStream Data stored out of order in Cassandra - posted by Gerard Maas <ge...@gmail.com> on 2015/11/30 15:45:07 UTC, 1 replies.
- Spark directStream with Kafka and process the lost messages. - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/11/30 16:38:52 UTC, 2 replies.
- spark.cleaner.ttl for 1.4.1 - posted by Michal Čizmazia <mi...@gmail.com> on 2015/11/30 17:46:03 UTC, 1 replies.
- Logs of Custom Receiver - posted by Matthias Niehoff <ma...@codecentric.de> on 2015/11/30 18:00:20 UTC, 1 replies.
- Help with type check - posted by Eyal Sharon <ey...@scene53.com> on 2015/11/30 19:13:24 UTC, 1 replies.
- Spark 1.5.2 + Hive 1.0.0 in Amazon EMR 4.2.0 - posted by Daniel Lopes <da...@bankfacil.com.br> on 2015/11/30 23:14:28 UTC, 0 replies.