- Re: SparkSql - java.util.NoSuchElementException: key not found: node when access JSON Array - posted by Todd Nist <ts...@gmail.com> on 2015/04/01 00:18:19 UTC, 1 replies.
- spark.sql.Row manipulation - posted by roni <ro...@gmail.com> on 2015/04/01 01:05:30 UTC, 1 replies.
- Re: How to configure SparkUI to use internal ec2 ip - posted by Anny Chen <an...@gmail.com> on 2015/04/01 01:17:24 UTC, 0 replies.
- deployment of spark on mesos and data locality in tachyon/hdfs - posted by Ankur Chauhan <ac...@brightcove.com> on 2015/04/01 01:30:35 UTC, 5 replies.
- Re: Query REST web service with Spark? - posted by Todd Nist <ts...@gmail.com> on 2015/04/01 02:06:38 UTC, 2 replies.
- Re: different result from implicit ALS with explicit ALS - posted by lisendong <li...@163.com> on 2015/04/01 03:45:42 UTC, 0 replies.
- Re: Can't run spark-submit with an application jar on a Mesos cluster - posted by seglo <wl...@gmail.com> on 2015/04/01 03:51:00 UTC, 0 replies.
- Minimum slots assigment to Spark on Mesos - posted by Stratos Dimopoulos <st...@gmail.com> on 2015/04/01 04:12:16 UTC, 0 replies.
- Spark SQL saveAsParquet failed after a few waves - posted by Yijie Shen <he...@gmail.com> on 2015/04/01 04:17:38 UTC, 2 replies.
- Re: Actor not found - posted by Shixiong Zhu <zs...@gmail.com> on 2015/04/01 04:59:07 UTC, 4 replies.
- Re: Implicit matrix factorization returning different results between spark 1.2.0 and 1.3.0 - posted by Xiangrui Meng <me...@gmail.com> on 2015/04/01 05:53:15 UTC, 1 replies.
- Anatomy of RDD : Deep dive into RDD data structure - posted by madhu phatak <ph...@gmail.com> on 2015/04/01 06:14:08 UTC, 1 replies.
- Creating Partitioned Parquet Tables via SparkSQL - posted by Denny Lee <de...@gmail.com> on 2015/04/01 06:35:08 UTC, 2 replies.
- Re: Broadcasting a parquet file using spark and python - posted by Jitesh chandra Mishra <ji...@gmail.com> on 2015/04/01 06:36:55 UTC, 1 replies.
- rdd.cache() not working ? - posted by "fightfate@163.com" <fi...@163.com> on 2015/04/01 07:21:25 UTC, 6 replies.
- SparkStreaming batch processing time question - posted by lu...@sina.com on 2015/04/01 08:05:26 UTC, 1 replies.
- Re: When do map how to get the line number? - posted by jitesh129 <ji...@gmail.com> on 2015/04/01 08:15:25 UTC, 0 replies.
- Re: Using 'fair' scheduler mode - posted by Raghavendra Pandey <ra...@gmail.com> on 2015/04/01 08:15:39 UTC, 2 replies.
- Re: --driver-memory parameter doesn't work for spark-submmit on yarn? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/01 08:40:22 UTC, 5 replies.
- Strategy regarding maximum number of executor's failure for log running jobs/ spark streaming jobs - posted by twinkle sachdeva <tw...@gmail.com> on 2015/04/01 08:52:44 UTC, 4 replies.
- Re: Re: SparkStreaming batch processing time question - posted by lu...@sina.com on 2015/04/01 08:58:47 UTC, 0 replies.
- Spark + Kafka - posted by James King <ja...@gmail.com> on 2015/04/01 09:21:03 UTC, 6 replies.
- RE: Using 'fair' scheduler mode with thrift server - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/04/01 09:44:33 UTC, 0 replies.
- Re: Unable to save dataframe with UDT created with sqlContext.createDataFrame - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/01 09:57:42 UTC, 2 replies.
- Disable stage logging to stdout - posted by Theodore Vasiloudis <th...@gmail.com> on 2015/04/01 11:56:53 UTC, 2 replies.
- Re: Spark Streaming and JMS - posted by danila <da...@gmail.com> on 2015/04/01 13:38:33 UTC, 1 replies.
- Spark 1.3.0 DataFrame and Postgres - posted by fergjo00 <jo...@gmail.com> on 2015/04/01 14:50:57 UTC, 2 replies.
- Re: Spark SQL queries hang forever - posted by Marius Soutier <mp...@gmail.com> on 2015/04/01 15:39:32 UTC, 8 replies.
- Spark throws rsync: change_dir errors on startup - posted by "Horsmann, Tobias" <to...@uni-due.de> on 2015/04/01 15:55:41 UTC, 2 replies.
- Re: Spark 1.3 build with hive support fails on JLine - posted by Ted Yu <yu...@gmail.com> on 2015/04/01 15:59:36 UTC, 0 replies.
- Size of arbitrary state managed via DStream updateStateByKey - posted by Vinoth Chandar <vi...@uber.com> on 2015/04/01 16:49:40 UTC, 2 replies.
- Error reading smallin in hive table with parquet format - posted by Masf <ma...@gmail.com> on 2015/04/01 16:53:21 UTC, 2 replies.
- Use with Data justifying Spark - posted by "Vila, Didier" <Di...@Teradata.com> on 2015/04/01 17:01:16 UTC, 2 replies.
- [SparkSQL] Zip DataFrame with a RDD - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/01 17:31:10 UTC, 0 replies.
- Spark 1.3.0 missing dependency? - posted by ARose <As...@telarix.com> on 2015/04/01 17:46:06 UTC, 1 replies.
- Spark Streaming 1.3 & Kafka Direct Streams - posted by Neelesh <ne...@gmail.com> on 2015/04/01 18:18:37 UTC, 9 replies.
- Issue on Spark SQL insert or create table with Spark running on AWS EMR -- s3n.S3NativeFileSystem: rename never finished - posted by chutium <te...@gmail.com> on 2015/04/01 18:34:01 UTC, 2 replies.
- HiveContext setConf seems not stable - posted by Hao Ren <in...@gmail.com> on 2015/04/01 18:38:54 UTC, 6 replies.
- Latest enhancement in Low Level Receiver based Kafka Consumer - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2015/04/01 18:49:00 UTC, 2 replies.
- Unable to run Spark application - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/04/01 19:09:54 UTC, 4 replies.
- Re: How to start master and workers on Windows - posted by ARose <As...@telarix.com> on 2015/04/01 19:19:35 UTC, 0 replies.
- Re: Spark Streaming S3 Performance Implications - posted by Mike Trienis <mi...@orcsol.com> on 2015/04/01 19:43:47 UTC, 0 replies.
- SparkSQL - Caching RDDs - posted by "Venkat, Ankam" <An...@centurylink.com> on 2015/04/01 20:02:29 UTC, 0 replies.
- Re: SparkSQL - Caching RDDs - posted by Michael Armbrust <mi...@databricks.com> on 2015/04/01 20:07:07 UTC, 0 replies.
- Re: How to specify the port for AM Actor ... - posted by Manoj Samel <ma...@gmail.com> on 2015/04/01 20:12:22 UTC, 0 replies.
- Spark-EC2 Security Group Error - posted by "Ganelin, Ilya" <Il...@capitalone.com> on 2015/04/01 20:19:34 UTC, 1 replies.
- Spark Sql - Missing Jar ? json_tuple NoClassDefFoundError - posted by Todd Nist <ts...@gmail.com> on 2015/04/01 21:19:21 UTC, 13 replies.
- Re: Spark streaming with Kafka, multiple partitions fail, single partition ok - posted by Nicolas Phung <ni...@gmail.com> on 2015/04/01 21:22:16 UTC, 2 replies.
- persist(MEMORY_ONLY) takes lot of time - posted by SamyaMaiti <sa...@gmail.com> on 2015/04/01 21:33:13 UTC, 1 replies.
- Spark on EC2 - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/01 21:41:25 UTC, 1 replies.
- Spark permission denied error when invoking saveAsTextFile - posted by Kannan Rajah <kr...@maprtech.com> on 2015/04/01 22:37:13 UTC, 1 replies.
- Spark 1.3.0 DataFrame count() method throwing java.io.EOFException - posted by ARose <As...@telarix.com> on 2015/04/01 23:57:43 UTC, 3 replies.
- Quick GraphX gutcheck - posted by hokiegeek2 <so...@gmail.com> on 2015/04/02 00:31:34 UTC, 1 replies.
- Spark SQL does not read from cached table if table is renamed - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/04/02 01:05:08 UTC, 2 replies.
- Re: Streaming anomaly detection using ARIMA - posted by Corey Nolet <cj...@gmail.com> on 2015/04/02 01:31:48 UTC, 3 replies.
- pyspark hbase range scan - posted by Eric Kimbrel <er...@soteradefense.com> on 2015/04/02 01:50:53 UTC, 2 replies.
- Re: Can I call aggregate UDF in DataFrame? - posted by Reynold Xin <rx...@databricks.com> on 2015/04/02 02:11:10 UTC, 1 replies.
- SparkR newHadoopAPIRDD - posted by Corey Nolet <cj...@gmail.com> on 2015/04/02 02:35:10 UTC, 0 replies.
- Spark, snappy and HDFS - posted by Nick Travers <n....@gmail.com> on 2015/04/02 04:13:42 UTC, 6 replies.
- Re: How to setup a Spark Cluter? - posted by amghost <zh...@gmail.com> on 2015/04/02 04:35:25 UTC, 0 replies.
- Incorrect results of spark sql on hive select - posted by LinQili <li...@outlook.com> on 2015/04/02 05:18:29 UTC, 0 replies.
- Data locality across jobs - posted by kjsingh <ka...@guavus.com> on 2015/04/02 06:09:33 UTC, 2 replies.
- Starting httpd: http: Syntax error on line 154 - posted by Ganon Pierce <ga...@me.com> on 2015/04/02 08:05:16 UTC, 2 replies.
- StackOverflow Problem with 1.3 mllib ALS - posted by Justin Yip <yi...@prediction.io> on 2015/04/02 08:54:46 UTC, 3 replies.
- JAVA_HOME problem - posted by 董帅阳 <91...@qq.com> on 2015/04/02 09:53:54 UTC, 5 replies.
- how to find near duplicate items from given dataset using spark - posted by Somnath Pandeya <So...@infosys.com> on 2015/04/02 10:18:23 UTC, 0 replies.
- How to learn Spark ? - posted by Star Guo <st...@ceph.me> on 2015/04/02 10:19:01 UTC, 8 replies.
- Setup Spark jobserver for Spark SQL - posted by Harika <ma...@gmail.com> on 2015/04/02 11:10:05 UTC, 1 replies.
- there are about 50% all-zero vector in the als result - posted by lisendong <li...@163.com> on 2015/04/02 11:43:57 UTC, 5 replies.
- Support for Data flow graphs and not DAG only - posted by anshu shukla <an...@gmail.com> on 2015/04/02 11:47:32 UTC, 0 replies.
- Re: How to learn Spark ? - posted by lu...@sina.com on 2015/04/02 12:10:18 UTC, 0 replies.
- [SparkSQL 1.3.0] Cannot resolve column name "SUM('p.q)" among (k, SUM('p.q)); - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/04/02 12:29:43 UTC, 2 replies.
- Mllib kmeans #iteration - posted by podioss <gr...@hotmail.com> on 2015/04/02 13:42:38 UTC, 2 replies.
- Connection pooling in spark jobs - posted by Sateesh Kavuri <sa...@gmail.com> on 2015/04/02 13:45:25 UTC, 7 replies.
- Matrix Transpose - posted by Spico Florin <sp...@gmail.com> on 2015/04/02 14:15:02 UTC, 0 replies.
- Error in SparkSQL/Scala IDE - posted by Sathish Kumaran Vairavelu <vs...@gmail.com> on 2015/04/02 14:39:22 UTC, 2 replies.
- A stream of json objects using Java - posted by James King <ja...@gmail.com> on 2015/04/02 15:22:40 UTC, 1 replies.
- From DataFrame to LabeledPoint - posted by drarse <dr...@gmail.com> on 2015/04/02 16:17:51 UTC, 4 replies.
- Spark streaming error in block pushing thread - posted by Bill Young <bi...@threatstack.com> on 2015/04/02 16:26:31 UTC, 0 replies.
- A problem with Spark 1.3 artifacts - posted by Jacek Lewandowski <ja...@datastax.com> on 2015/04/02 16:34:14 UTC, 4 replies.
- Re: Re:How to learn Spark ? - posted by Star Guo <st...@ceph.me> on 2015/04/02 17:18:48 UTC, 0 replies.
- Re How to learn Spark ? - posted by Star Guo <st...@ceph.me> on 2015/04/02 17:21:18 UTC, 0 replies.
- Spark Streaming Worker runs out of inodes - posted by andrem <am...@gmail.com> on 2015/04/02 17:23:35 UTC, 4 replies.
- Re: workers no route to host - posted by Dean Wampler <de...@gmail.com> on 2015/04/02 17:29:55 UTC, 0 replies.
- conversion from java collection type to scala JavaRDD - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/02 17:33:02 UTC, 5 replies.
- Spark Streaming Error in block pushing thread - posted by byoung <bi...@threatstack.com> on 2015/04/02 17:45:19 UTC, 4 replies.
- Spark SQL. Memory consumption - posted by Masf <ma...@gmail.com> on 2015/04/02 17:46:48 UTC, 15 replies.
- Is the disk space in SPARK_LOCAL_DIRS cleanned up? - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/02 18:14:00 UTC, 8 replies.
- Spark + Kinesis - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/02 18:53:32 UTC, 14 replies.
- Reading a large file (binary) into RDD - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/04/02 19:33:01 UTC, 8 replies.
- Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available - posted by Todd Nist <ts...@gmail.com> on 2015/04/02 19:56:33 UTC, 2 replies.
- Spark 1.3 UDF ClassNotFoundException - posted by ganterm <ga...@gmail.com> on 2015/04/02 20:06:59 UTC, 2 replies.
- Generating a schema in Spark 1.3 failed while using DataTypes. - posted by ogoh <ok...@gmail.com> on 2015/04/02 20:45:13 UTC, 4 replies.
- Simple but faster data streaming - posted by Harut Martirosyan <ha...@gmail.com> on 2015/04/02 20:51:48 UTC, 1 replies.
- RE: Date and decimal datatype not working - posted by "BASAK, ANANDA" <ab...@att.com> on 2015/04/02 21:26:23 UTC, 0 replies.
- Mesos - spark task constraints - posted by Ankur Chauhan <ac...@brightcove.com> on 2015/04/02 21:47:28 UTC, 0 replies.
- Need a spark mllib tutorial - posted by "Phani Yadavilli -X (pyadavil)" <py...@cisco.com> on 2015/04/02 21:51:54 UTC, 1 replies.
- Re: input size too large | Performance issues with Spark - posted by Christian Perez <ch...@svds.com> on 2015/04/02 21:55:13 UTC, 1 replies.
- Re: Submitting to a cluster behind a VPN, configuring different IP address - posted by Michael Quinlan <mq...@gmail.com> on 2015/04/02 23:18:27 UTC, 2 replies.
- Re: "Spark-events does not exist" error, while it does with all the req. rights - posted by Marcelo Vanzin <va...@cloudera.com> on 2015/04/03 03:49:32 UTC, 0 replies.
- ArrayBuffer within a DataFrame - posted by Denny Lee <de...@gmail.com> on 2015/04/03 04:10:17 UTC, 7 replies.
- Cannot run the example in the Spark 1.3.0 following the document - posted by java8964 <ja...@hotmail.com> on 2015/04/03 04:15:18 UTC, 4 replies.
- maven compile error - posted by myelinji <my...@aliyun.com> on 2015/04/03 04:19:59 UTC, 2 replies.
- Tableau + Spark SQL Thrift Server + Cassandra - posted by Mohammed Guller <mo...@glassbeam.com> on 2015/04/03 04:20:39 UTC, 12 replies.
- [SQL] Simple DataFrame questions - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/03 04:45:38 UTC, 2 replies.
- Delaying failed task retries + giving failing tasks to different nodes - posted by Stephen Merity <st...@commoncrawl.org> on 2015/04/03 05:11:32 UTC, 1 replies.
- Fwd: - posted by Himanish Kushary <hi...@gmail.com> on 2015/04/03 07:05:50 UTC, 0 replies.
- Matei Zaharai: Reddit Ask Me Anything - posted by ben lorica <da...@gmail.com> on 2015/04/03 07:12:42 UTC, 1 replies.
- Spark Application Stages and DAG - posted by Vijay Innamuri <vi...@gmail.com> on 2015/04/03 09:35:40 UTC, 3 replies.
- About Waiting batches on the spark streaming UI - posted by "bit1129@163.com" <bi...@163.com> on 2015/04/03 09:59:28 UTC, 4 replies.
- Spark Job Failed - Class not serializable - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/03 12:23:46 UTC, 5 replies.
- Re: maven compile error - posted by myelinji <my...@aliyun.com> on 2015/04/03 12:58:25 UTC, 1 replies.
- Which OS for Spark cluster nodes? - posted by "Horsmann, Tobias" <to...@uni-due.de> on 2015/04/03 13:28:48 UTC, 2 replies.
- Parquet timestamp support for Hive? - posted by Rex Xiong <by...@gmail.com> on 2015/04/03 13:59:48 UTC, 1 replies.
- Re: How to get a top X percent of a distribution represented as RDD - posted by Aung Htet <au...@gmail.com> on 2015/04/03 14:20:12 UTC, 1 replies.
- Spark unit test fails - posted by Manas Kar <ma...@gmail.com> on 2015/04/03 15:39:01 UTC, 3 replies.
- Spark Memory Utilities - posted by Stephen Carman <sc...@coldlight.com> on 2015/04/03 17:45:24 UTC, 0 replies.
- MLlib: save models to HDFS? - posted by "S. Zhou" <my...@yahoo.com.INVALID> on 2015/04/03 18:16:19 UTC, 2 replies.
- variant record by case classes in shell fails? - posted by Michael Albert <m_...@yahoo.com.INVALID> on 2015/04/03 20:45:47 UTC, 1 replies.
- Spark Streaming FileStream Nested File Support - posted by adamgerst <ad...@gmail.com> on 2015/04/03 21:23:09 UTC, 4 replies.
- Spark TeraSort source request - posted by Tom <th...@gmail.com> on 2015/04/03 21:47:50 UTC, 4 replies.
- spark mesos deployment : starting workers based on attributes - posted by Ankur Chauhan <ac...@brightcove.com> on 2015/04/03 22:08:06 UTC, 3 replies.
- Regarding MLLIB sparse and dense matrix - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/03 22:24:30 UTC, 1 replies.
- Re: WordCount example - posted by Mohit Anchlia <mo...@gmail.com> on 2015/04/03 23:18:45 UTC, 5 replies.
- Migrating from Spark 0.8.0 to Spark 1.3.0 - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2015/04/04 01:11:20 UTC, 1 replies.
- Re: kmeans|| in Spark is not real paralleled? - posted by Xi Shen <da...@gmail.com> on 2015/04/04 03:44:03 UTC, 0 replies.
- Issue of sqlContext.createExternalTable with parquet partition discovery after changing folder structure - posted by Rex Xiong <by...@gmail.com> on 2015/04/04 11:24:37 UTC, 1 replies.
- Spark Vs MR - posted by SamyaMaiti <sa...@gmail.com> on 2015/04/04 12:19:43 UTC, 1 replies.
- Re: 4 seconds to count 13M lines. Does it make sense? - posted by SamyaMaiti <sa...@gmail.com> on 2015/04/04 12:28:06 UTC, 1 replies.
- Need help with ALS Recommendation code - posted by "Phani Yadavilli -X (pyadavil)" <py...@cisco.com> on 2015/04/04 14:06:22 UTC, 2 replies.
- Re: Parquet Hive table become very slow on 1.3? - posted by Cheng Lian <li...@gmail.com> on 2015/04/04 15:47:45 UTC, 4 replies.
- newAPIHadoopRDD Mutiple scan result return from Hbase - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/04 17:24:00 UTC, 8 replies.
- Processing Time Spikes (Spark Streaming) - posted by t1ny <wb...@gmail.com> on 2015/04/04 20:55:46 UTC, 0 replies.
- UNRESOLVED DEPENDENCIES while building Spark 1.3.0 - posted by mas <ma...@gmail.com> on 2015/04/04 22:33:28 UTC, 2 replies.
- CPU Usage for Spark Local Mode - posted by Wenlei Xie <we...@gmail.com> on 2015/04/04 23:30:47 UTC, 0 replies.
- DataFrame groupBy MapType - posted by Justin Yip <yi...@prediction.io> on 2015/04/05 02:25:31 UTC, 3 replies.
- Spark Streaming program questions - posted by nickos168 <ni...@yahoo.com.INVALID> on 2015/04/05 03:13:37 UTC, 2 replies.
- Re: Spark SQL Self join with agreegate - posted by SachinJanani <sa...@gmail.com> on 2015/04/05 07:02:19 UTC, 0 replies.
- Spark streaming with Kafka- couldnt find KafkaUtils - posted by Priya Ch <le...@gmail.com> on 2015/04/05 08:30:42 UTC, 2 replies.
- Pseudo Spark Streaming ? - posted by Bahubali Jain <ba...@gmail.com> on 2015/04/05 12:43:35 UTC, 1 replies.
- Re: Low resource when upgrading from 1.1.0 to 1.3.0 - posted by nsalian <ne...@gmail.com> on 2015/04/05 20:35:49 UTC, 1 replies.
- Diff between foreach and foreachsync - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/05 22:06:53 UTC, 0 replies.
- Sending RDD object over the network - posted by raggy <ra...@gmail.com> on 2015/04/05 22:26:38 UTC, 2 replies.
- Re: NoSuchMethodException KafkaUtils. - posted by Yamini <ya...@gmail.com> on 2015/04/05 23:29:04 UTC, 0 replies.
- Add row IDs column to data frame - posted by olegshirokikh <ol...@solver.com> on 2015/04/06 04:50:31 UTC, 6 replies.
- Create DataFrame from textFile with unknown columns - posted by olegshirokikh <ol...@solver.com> on 2015/04/06 05:49:19 UTC, 0 replies.
- Re: Write to Parquet File in Python - posted by Akriti23 <sh...@gmail.com> on 2015/04/06 11:42:39 UTC, 0 replies.
- Cannot build "learning spark" project - posted by Adamantios Corais <ad...@gmail.com> on 2015/04/06 13:23:26 UTC, 1 replies.
- Spark 1.3.0: Running Pi example on YARN fails - posted by Zork Sail <zo...@gmail.com> on 2015/04/06 14:00:48 UTC, 10 replies.
- Learning Spark - posted by Abhideep Chakravarty <Ab...@mindtree.com> on 2015/04/06 14:30:04 UTC, 0 replies.
- RDD generated on every query - posted by Siddharth Ubale <si...@syncoms.com> on 2015/04/06 14:31:40 UTC, 2 replies.
- Re: Learning Spark - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/06 14:59:04 UTC, 1 replies.
- Using DIMSUM with ids - posted by James <al...@gmail.com> on 2015/04/06 15:08:36 UTC, 2 replies.
- (send this email to subscribe) - posted by 林晨 <be...@gmail.com> on 2015/04/06 15:52:52 UTC, 1 replies.
- What happened to the Row class in 1.3.0? - posted by ARose <As...@telarix.com> on 2015/04/06 16:23:02 UTC, 5 replies.
- DataFrame -- help with encoding factor variables - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/06 16:31:09 UTC, 1 replies.
- Spark Avarage - posted by barisak <ba...@gmail.com> on 2015/04/06 16:50:16 UTC, 3 replies.
- How to work with sparse data in Python? - posted by SecondDatke <lo...@outlook.com> on 2015/04/06 16:58:08 UTC, 1 replies.
- Re: java.io.NotSerializableException: org.apache.hadoop.hbase.client.Result - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/06 20:37:07 UTC, 1 replies.
- Spark SQL Parquet as External table - 1.3.x HiveMetastoreType now hidden - posted by Todd Nist <ts...@gmail.com> on 2015/04/06 20:37:59 UTC, 5 replies.
- java.lang.ClassCastException: scala.Tuple2 cannot be cast to org.apache.spark.mllib.regression.LabeledPoint - posted by Joanne Contact <jo...@gmail.com> on 2015/04/06 20:59:15 UTC, 1 replies.
- org.apache.spark.ml.recommendation.ALS - posted by Jay Katukuri <jk...@apple.com> on 2015/04/06 21:06:23 UTC, 9 replies.
- How to restrict foreach on a streaming RDD only once upon receiver completion - posted by Hari Polisetty <hp...@icloud.com> on 2015/04/06 21:31:46 UTC, 4 replies.
- task not serialize - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/06 22:30:39 UTC, 7 replies.
- SparkSQL + Parquet performance - posted by Paolo Platter <pa...@agilelab.it> on 2015/04/06 23:18:31 UTC, 0 replies.
- Spark Druid integration - posted by Paolo Platter <pa...@agilelab.it> on 2015/04/06 23:23:47 UTC, 1 replies.
- Spark SQL code generation - posted by Akshat Aranya <aa...@gmail.com> on 2015/04/06 23:32:49 UTC, 3 replies.
- Broadcast value return empty after turn to org.apache.spark.serializer.KryoSerializer - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/06 23:33:49 UTC, 0 replies.
- Processing Large Images in Spark? - posted by Patrick Young <pa...@gmail.com> on 2015/04/07 00:05:04 UTC, 3 replies.
- Super slow caching in 1.3? - posted by Christian Perez <ch...@svds.com> on 2015/04/07 02:00:36 UTC, 8 replies.
- Seeing message about receiver not being de-registered on invoking Streaming context stop - posted by Hari Polisetty <hp...@icloud.com> on 2015/04/07 05:16:14 UTC, 2 replies.
- Spark 1.2.0 with Play/Activator - posted by Manish Gupta 8 <mg...@sapient.com> on 2015/04/07 07:23:46 UTC, 4 replies.
- graphx running time - posted by daze5112 <da...@ato.gov.au> on 2015/04/07 07:39:35 UTC, 0 replies.
- Microsoft SQL jdbc support from spark sql - posted by bipin <bi...@gmail.com> on 2015/04/07 07:40:47 UTC, 13 replies.
- Regarding benefits of using more than one cpu for a task in spark - posted by twinkle sachdeva <tw...@gmail.com> on 2015/04/07 09:01:26 UTC, 1 replies.
- [DAGSchedule][OutputCommitCoordinator] OutputCommitCoordinator.authorizedCommittersByStage Map Out Of Memory - posted by Tao Li <li...@gmail.com> on 2015/04/07 10:56:57 UTC, 0 replies.
- Can not get executor's Log from Spark's History Server - posted by donhoff_h <16...@qq.com> on 2015/04/07 11:51:02 UTC, 1 replies.
- [GraphX] aggregateMessages with active set - posted by James <al...@gmail.com> on 2015/04/07 11:56:49 UTC, 4 replies.
- 'Java heap space' error occured when query 4G data file from HDFS - posted by 李铖 <li...@gmail.com> on 2015/04/07 12:36:18 UTC, 4 replies.
- scala.MatchError: class org.apache.avro.Schema (of class java.lang.Class) - posted by Yamini <ya...@gmail.com> on 2015/04/07 12:57:17 UTC, 3 replies.
- Issue with pyspark 1.3.0, sql package and rows - posted by Stefano Parmesan <pa...@spaziodati.eu> on 2015/04/07 13:22:25 UTC, 3 replies.
- Difference between textFile Vs hadoopFile (textInoutFormat) on HDFS data - posted by Puneet Kumar Ojha <pu...@pubmatic.com> on 2015/04/07 13:44:30 UTC, 2 replies.
- Pipelines for controlling workflow - posted by Staffan <st...@gmail.com> on 2015/04/07 13:56:38 UTC, 0 replies.
- Advice using Spark SQL and Thrift JDBC Server - posted by James Aley <ja...@swiftkey.com> on 2015/04/07 15:29:13 UTC, 13 replies.
- ML consumption time based on data volume - same cluster - posted by Vasyl Harasymiv <va...@gmail.com> on 2015/04/07 16:15:35 UTC, 2 replies.
- RE: [BUG]Broadcast value return empty after turn to org.apache.spark.serializer.KryoSerializer - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/07 16:29:12 UTC, 1 replies.
- Does spark utilize the sorted order of hbase keys, when using hbase as data source - posted by Юра <rv...@gmail.com> on 2015/04/07 16:36:18 UTC, 3 replies.
- The differentce between SparkSql/DataFram join and Rdd join - posted by Hao Ren <in...@gmail.com> on 2015/04/07 17:33:53 UTC, 3 replies.
- Re: RDD collect hangs on large input data - posted by Jon Chase <jo...@gmail.com> on 2015/04/07 18:19:12 UTC, 2 replies.
- FlatMapPair run for longer time - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/07 18:54:51 UTC, 1 replies.
- Incremently load big RDD file into Memory - posted by mas <ma...@gmail.com> on 2015/04/07 19:04:12 UTC, 3 replies.
- Array[T].distinct doesn't work inside RDD - posted by anny9699 <an...@gmail.com> on 2015/04/07 20:38:59 UTC, 3 replies.
- How to generate Java bean class for avro files using spark avro project - posted by Yamini <ya...@gmail.com> on 2015/04/07 20:43:13 UTC, 0 replies.
- How to get SparkSql results on a webpage on real time - posted by "Mukund Ranjan (muranjan)" <mu...@cisco.com> on 2015/04/07 21:10:12 UTC, 0 replies.
- set spark.storage.memoryFraction to 0 when no cached RDD and memory area for broadcast value? - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/07 21:15:53 UTC, 2 replies.
- broken link on Spark Programming Guide - posted by jonathangreenleaf <jo...@gmail.com> on 2015/04/07 22:32:43 UTC, 3 replies.
- parquet partition discovery - posted by Christopher Petro <cp...@kcg.com> on 2015/04/07 23:21:43 UTC, 2 replies.
- How to use Joda Time with Spark SQL? - posted by adamgerst <ad...@gmail.com> on 2015/04/08 00:09:55 UTC, 5 replies.
- Unable to run spark examples on cloudera - posted by Georgi Knox <gk...@bit.ly> on 2015/04/08 00:23:08 UTC, 0 replies.
- Job submission API - posted by Prashant Kommireddi <pr...@gmail.com> on 2015/04/08 01:01:20 UTC, 3 replies.
- Error when running Spark on Windows 8.1 - posted by Arun Lists <li...@gmail.com> on 2015/04/08 01:54:02 UTC, 0 replies.
- Drools in Spark - posted by Sathish Kumaran Vairavelu <vs...@gmail.com> on 2015/04/08 01:59:09 UTC, 0 replies.
- Expected behavior for DataFrame.unionAll - posted by Justin Yip <yi...@prediction.io> on 2015/04/08 02:00:14 UTC, 3 replies.
- Specifying Spark property from command line? - posted by Arun Lists <li...@gmail.com> on 2015/04/08 02:00:57 UTC, 1 replies.
- HiveThriftServer2 - posted by Mohammed Guller <mo...@glassbeam.com> on 2015/04/08 02:58:11 UTC, 2 replies.
- Timeout errors from Akka in Spark 1.2.1 - posted by Nikunj Bansal <nb...@gmail.com> on 2015/04/08 03:00:04 UTC, 8 replies.
- value reduceByKeyAndWindow is not a member of org.apache.spark.streaming.dstream.DStream[(String, Int)] - posted by Su She <su...@gmail.com> on 2015/04/08 03:08:09 UTC, 1 replies.
- DataFrame degraded performance after DataFrame.cache - posted by Justin Yip <yi...@prediction.io> on 2015/04/08 03:31:19 UTC, 5 replies.
- Cannot change the memory of workers - posted by Jia Yu <ji...@asu.edu> on 2015/04/08 03:57:21 UTC, 1 replies.
- Re: streamSQL - is it available or is it in POC ? - posted by haopu <hw...@qilinsoft.com> on 2015/04/08 05:55:33 UTC, 0 replies.
- EC2 spark-submit --executor-memory - posted by spark_user_2015 <li...@adobe.com> on 2015/04/08 05:58:42 UTC, 1 replies.
- Caching and Actions - posted by spark_user_2015 <li...@adobe.com> on 2015/04/08 06:09:07 UTC, 4 replies.
- Unable to specify multiple directories as input - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/08 06:19:43 UTC, 1 replies.
- need info on Spark submit on yarn-cluster mode - posted by sachin Singh <sa...@gmail.com> on 2015/04/08 08:24:39 UTC, 1 replies.
- Need subscription process - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/08 10:47:37 UTC, 1 replies.
- partition by category - posted by SiMaYunRui <my...@hotmail.com> on 2015/04/08 11:00:10 UTC, 0 replies.
- Spark Tasks failing with "Cannot find address" - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/08 11:00:19 UTC, 1 replies.
- Opening many Parquet files = slow - posted by Eric Eijkelenboom <er...@gmail.com> on 2015/04/08 15:15:15 UTC, 9 replies.
- Spark SQL Avro Library for 1.2 - posted by roy <rp...@njit.edu> on 2015/04/08 15:17:12 UTC, 0 replies.
- Subscribe - posted by Idris Ali <ps...@gmail.com> on 2015/04/08 15:28:45 UTC, 1 replies.
- Spark 1.3 on CDH 5.3.1 YARN - posted by roy <rp...@njit.edu> on 2015/04/08 15:34:31 UTC, 1 replies.
- Error running Spark on Cloudera - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/04/08 15:45:28 UTC, 1 replies.
- Maintaining state - posted by boston2004_williams <bo...@yahoo.com> on 2015/04/08 17:26:57 UTC, 0 replies.
- [ThriftServer] User permissions warning - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/08 17:49:50 UTC, 1 replies.
- Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds] when create context - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/08 18:18:59 UTC, 2 replies.
- start-slave.sh not starting - posted by Mohit Anchlia <mo...@gmail.com> on 2015/04/08 18:54:53 UTC, 1 replies.
- Reading file with Unicode characters - posted by Arun Lists <li...@gmail.com> on 2015/04/08 19:35:18 UTC, 2 replies.
- Spark Streaming and SQL - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/08 20:38:55 UTC, 1 replies.
- Class incompatible error - posted by Mohit Anchlia <mo...@gmail.com> on 2015/04/08 21:43:58 UTC, 5 replies.
- Support for Joda - posted by Patrick Grandjean <p....@gmail.com> on 2015/04/08 21:53:33 UTC, 1 replies.
- Unit testing with HiveContext - posted by Daniel Siegmann <da...@teamaol.com> on 2015/04/08 22:07:09 UTC, 2 replies.
- Empty RDD? - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/08 22:08:06 UTC, 3 replies.
- sortByKey with multiple partitions - posted by Tom <th...@gmail.com> on 2015/04/09 00:01:43 UTC, 1 replies.
- function to convert to pair - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/09 00:40:10 UTC, 2 replies.
- Re: Cannot run unit test. - posted by Mike Trienis <mi...@orcsol.com> on 2015/04/09 00:59:00 UTC, 0 replies.
- Regarding GroupBy - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/09 01:15:38 UTC, 0 replies.
- Pyspark query by binary type - posted by jmalm <jo...@gmail.com> on 2015/04/09 01:52:02 UTC, 0 replies.
- External JARs not loading Spark Shell Scala 2.11 - posted by anakos <an...@gmail.com> on 2015/04/09 11:38:25 UTC, 11 replies.
- Spark Streaming scenarios - posted by Vinay Kesarwani <vn...@gmail.com> on 2015/04/09 11:59:34 UTC, 0 replies.
- SQL can't not create Hive database - posted by Hao Ren <in...@gmail.com> on 2015/04/09 13:59:00 UTC, 4 replies.
- override log4j.properties - posted by patcharee <Pa...@uni.no> on 2015/04/09 14:17:28 UTC, 1 replies.
- save as text file throwing null pointer error. - posted by Somnath Pandeya <So...@infosys.com> on 2015/04/09 15:16:43 UTC, 0 replies.
- Join on Spark too slow. - posted by Kostas Kloudas <kk...@gmail.com> on 2015/04/09 15:28:12 UTC, 5 replies.
- Pairwise computations within partition - posted by abellet <au...@telecom-paristech.fr> on 2015/04/09 15:54:55 UTC, 1 replies.
- Kryo exception : Encountered unregistered class ID: 13994 - posted by mehdisinger <me...@lampiris.be> on 2015/04/09 16:15:04 UTC, 4 replies.
- How to submit job in a different user? - posted by SecondDatke <lo...@outlook.com> on 2015/04/09 16:17:27 UTC, 0 replies.
- Jobs failing with KryoException (BufferOverflow) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/09 16:28:32 UTC, 6 replies.
- spark job progress-style report on console ? - posted by roy <rp...@njit.edu> on 2015/04/09 16:33:27 UTC, 1 replies.
- Any success on embedding local Spark in OSGi? - posted by Deniz Acay <de...@gmail.com> on 2015/04/09 16:44:44 UTC, 0 replies.
- RDD union - posted by Debasish Das <de...@gmail.com> on 2015/04/09 16:45:24 UTC, 0 replies.
- Lookup / Access of master data in spark streaming - posted by Amit Assudani <aa...@impetus.com> on 2015/04/09 17:20:52 UTC, 0 replies.
- Which Hive version should be used for Spark 1.3 - posted by Arthur Chan <ar...@gmail.com> on 2015/04/09 17:55:12 UTC, 2 replies.
- Spark Job Run Resource Estimation ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/09 18:21:50 UTC, 2 replies.
- Re: "Could not compute split, block not found" in Spark Streaming Simple Application - posted by Saiph Kappa <sa...@gmail.com> on 2015/04/09 18:56:31 UTC, 2 replies.
- Spark Job #of attempts ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/09 19:31:40 UTC, 3 replies.
- Continuous WARN messages from BlockManager about block replication - posted by Nandan Tammineedi <na...@defend7.com> on 2015/04/09 20:24:43 UTC, 4 replies.
- make two rdd co-partitioned in python - posted by pop <xi...@adobe.com> on 2015/04/09 20:57:51 UTC, 1 replies.
- Re: Lookup / Access of master data in spark streaming - posted by Tathagata Das <td...@databricks.com> on 2015/04/09 21:13:34 UTC, 1 replies.
- Overlapping classes warnings - posted by Ritesh Kumar Singh <ri...@gmail.com> on 2015/04/09 22:54:55 UTC, 7 replies.
- Spark Daytona Gray Sort Benchmark Configuration - posted by Soila Pertet Kavulya <sk...@gmail.com> on 2015/04/09 23:51:59 UTC, 2 replies.
- Need help in caching and persist - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/10 00:25:59 UTC, 0 replies.
- Text By the Bay: applied NLP/ML with lots of Scala & Spark, April 24-25, San Francisco - posted by Alexy Khrabrov <al...@scalable.pro> on 2015/04/10 00:41:58 UTC, 0 replies.
- Spark SQL Parquet issues on 1.3.1 rc2 / 1.4.0 SNAPSHOT - posted by Anand Mohan <ch...@gmail.com> on 2015/04/10 02:37:01 UTC, 0 replies.
- Questions about Spark Internals and Implementation - posted by raggy <ra...@gmail.com> on 2015/04/10 04:09:59 UTC, 1 replies.
- ClassCastException when calling updateStateKey - posted by prai100 <pr...@gmail.com> on 2015/04/10 04:47:55 UTC, 5 replies.
- replace default application.conf file in spark-submit - posted by varvind <vi...@gmail.com> on 2015/04/10 05:40:22 UTC, 0 replies.
- Keep local variable - posted by TJ Klein <TJ...@gmail.com> on 2015/04/10 06:41:29 UTC, 5 replies.
- stream caching - posted by Vinay Kesarwani <vn...@gmail.com> on 2015/04/10 07:26:31 UTC, 1 replies.
- Increase partitions reading Parquet File - posted by Masf <ma...@gmail.com> on 2015/04/10 08:44:37 UTC, 4 replies.
- Spark SQL or rules hot reload - posted by Bruce Dou <do...@gmail.com> on 2015/04/10 08:55:14 UTC, 1 replies.
- How to set timestamp format when giving schema ? - posted by bipin <bi...@gmail.com> on 2015/04/10 09:03:10 UTC, 0 replies.
- Re: saveAsTable fails to save RDD in Spark SQL 1.3.0 - posted by rahulkumar-aws <ra...@gmail.com> on 2015/04/10 11:06:32 UTC, 1 replies.
- Unit tests with spark local[4] on Jenkins - posted by Eugene Morozov <fa...@list.ru> on 2015/04/10 11:17:02 UTC, 1 replies.
- Spark Streaming not picking current date properly - posted by Anshul Singhle <an...@betaglide.com> on 2015/04/10 12:52:29 UTC, 1 replies.
- coalesce(*, false) problem - posted by 邓刚, , 技术中心, , tr...@vipshop.com on 2015/04/10 14:16:12 UTC, 1 replies.
- Yarn application state monitor thread dying on IOException - posted by Lorenz Knies <me...@l1024.org> on 2015/04/10 14:24:48 UTC, 2 replies.
- truncate double value to two decimal in rdd[Double] - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/04/10 15:17:28 UTC, 0 replies.
- Spark + Hbase - posted by Siddharth Ubale <si...@syncoms.com> on 2015/04/10 15:40:20 UTC, 0 replies.
- Running jobs on Spark which is build by myself fails :( - posted by 林晨 <be...@gmail.com> on 2015/04/10 15:46:06 UTC, 0 replies.
- not found: value SQLContextSingleton - posted by "Mukund Ranjan (muranjan)" <mu...@cisco.com> on 2015/04/10 15:47:13 UTC, 1 replies.
- Re: truncate double value to two decimal in rdd[Double] - posted by Dean Wampler <de...@gmail.com> on 2015/04/10 15:59:41 UTC, 0 replies.
- Spark SQL: equality with binary columns - posted by James Aley <ja...@swiftkey.com> on 2015/04/10 17:17:44 UTC, 1 replies.
- Benchmaking col vs row similarities - posted by Debasish Das <de...@gmail.com> on 2015/04/10 17:32:58 UTC, 3 replies.
- Pros and Cons of Spark Batch over Streaming for processing data queried from Elastic Search - posted by Hari Polisetty <hp...@icloud.com> on 2015/04/10 18:29:33 UTC, 2 replies.
- executor lost failure and update spark.executor.memory - posted by Peng Xia <sp...@gmail.com> on 2015/04/10 18:31:17 UTC, 0 replies.
- regarding reduce function - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/10 19:12:42 UTC, 1 replies.
- [Spark1.3] UDF registration issue - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/10 20:30:55 UTC, 2 replies.
- Getting outofmemory errors on spark - posted by Anshul Singhle <an...@betaglide.com> on 2015/04/10 21:48:56 UTC, 0 replies.
- The $ notation for DataFrame Column - posted by Justin Yip <yi...@prediction.io> on 2015/04/10 23:08:58 UTC, 0 replies.
- How to use the --files arg - posted by Udit Mehta <um...@groupon.com> on 2015/04/10 23:56:07 UTC, 0 replies.
- DataFrame column name restriction - posted by Justin Yip <yi...@prediction.io> on 2015/04/11 04:18:51 UTC, 1 replies.
- foreach going in infinite loop - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/11 04:34:31 UTC, 0 replies.
- Unusual behavior with leftouterjoin - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/11 13:20:16 UTC, 1 replies.
- Taks going into NODE_LOCAL at beginning of job - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/11 16:45:35 UTC, 0 replies.
- Re: Spark on Mesos / Executor Memory - posted by Tim Chen <ti...@mesosphere.io> on 2015/04/11 19:29:56 UTC, 0 replies.
- Spark support for Hadoop Formats (Avro) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/12 05:44:07 UTC, 2 replies.
- regarding ZipWithIndex - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/12 22:16:19 UTC, 11 replies.
- counters in spark - posted by Grandl Robert <rg...@yahoo.com.INVALID> on 2015/04/13 00:32:59 UTC, 2 replies.
- How to do dispatching in Streaming? - posted by Jianshi Huang <ji...@gmail.com> on 2015/04/13 04:52:33 UTC, 8 replies.
- Manning looking for a co-author for the GraphX in Action book - posted by Reynold Xin <rx...@databricks.com> on 2015/04/13 09:32:47 UTC, 0 replies.
- Spark Cluster: RECEIVED SIGNAL 15: SIGTERM - posted by James King <ja...@gmail.com> on 2015/04/13 10:55:15 UTC, 3 replies.
- Re: Problem getting program to run on 15TB input - posted by Daniel Mahler <dm...@gmail.com> on 2015/04/13 12:12:06 UTC, 0 replies.
- Exception"Driver-Memory" while running Spark job on Yarn-cluster - posted by sachin Singh <sa...@gmail.com> on 2015/04/13 12:23:15 UTC, 1 replies.
- Re: Parquet File Binary column statistics error when reuse byte[] among rows - posted by Cheng Lian <li...@gmail.com> on 2015/04/13 12:34:55 UTC, 0 replies.
- Reading files from http server - posted by Peter Rudenko <pe...@gmail.com> on 2015/04/13 12:55:42 UTC, 0 replies.
- How to use multiple app jar files? - posted by Michael Weir <mi...@gmail.com> on 2015/04/13 13:15:59 UTC, 2 replies.
- Re: MLlib : Gradient Boosted Trees classification confidence - posted by mike <mi...@gmail.com> on 2015/04/13 14:08:04 UTC, 0 replies.
- Sqoop parquet file not working in spark - posted by bipin <bi...@gmail.com> on 2015/04/13 14:34:17 UTC, 0 replies.
- Packaging Java + Python library - posted by Punya Biswal <pb...@palantir.com> on 2015/04/13 15:41:29 UTC, 1 replies.
- Configuring amount of disk space available to spark executors in mesos? - posted by Jonathan Coveney <jc...@gmail.com> on 2015/04/13 17:19:06 UTC, 2 replies.
- Equi Join is taking for ever. 1 Task is Running while other 199 are complete - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/13 17:32:58 UTC, 8 replies.
- Spark Streaming Kafka Consumer, Confluent Platform, Avro & StorageLevel - posted by Nicolas Phung <ni...@gmail.com> on 2015/04/13 17:38:46 UTC, 0 replies.
- What's the cleanest way to make spark aware of my custom scheduler? - posted by Jonathan Coveney <jc...@gmail.com> on 2015/04/13 17:44:55 UTC, 0 replies.
- Help in transforming the RDD - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/13 18:08:53 UTC, 0 replies.
- Multiple Kafka Recievers - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/04/13 18:31:32 UTC, 1 replies.
- Need some guidance - posted by Marco Shaw <ma...@gmail.com> on 2015/04/13 18:45:32 UTC, 4 replies.
- Multipart upload to S3 fails with Bad Digest Exceptions - posted by Eugen Cepoi <ce...@gmail.com> on 2015/04/13 18:46:53 UTC, 1 replies.
- feature scaling in GeneralizedLinearAlgorithm.scala - posted by Jianguo Li <fl...@gmail.com> on 2015/04/13 18:58:21 UTC, 1 replies.
- How to load avro file into spark not on Hadoop in pyspark? - posted by sa <as...@gmail.com> on 2015/04/13 22:22:06 UTC, 1 replies.
- Rack locality - posted by rcharaya <ri...@gmail.com> on 2015/04/13 23:09:11 UTC, 1 replies.
- Spark Worker IP Error - posted by DStrip <d....@hotmail.com> on 2015/04/14 01:09:13 UTC, 0 replies.
- Registering classes with KryoSerializer - posted by Arun Lists <li...@gmail.com> on 2015/04/14 01:09:33 UTC, 5 replies.
- How to access postgresql on Spark SQL - posted by do...@sina.com on 2015/04/14 03:07:45 UTC, 1 replies.
- sbt-assembly spark-streaming-kinesis-asl error - posted by Mike Trienis <mi...@orcsol.com> on 2015/04/14 03:36:31 UTC, 8 replies.
- Re: Task result in Spark Worker Node - posted by Imran Rashid <ir...@cloudera.com> on 2015/04/14 04:22:26 UTC, 3 replies.
- Re: How to get rdd count() without double evaluation of the RDD? - posted by Imran Rashid <ir...@cloudera.com> on 2015/04/14 04:28:20 UTC, 0 replies.
- Re: Understanding Spark Memory distribution - posted by Imran Rashid <ir...@cloudera.com> on 2015/04/14 05:09:24 UTC, 1 replies.
- How can I add my custom Rule to spark sql? - posted by Andy Zhao <an...@gmail.com> on 2015/04/14 05:52:42 UTC, 1 replies.
- Re: Spark sql failed in yarn-cluster mode when connecting to non-default hive database - posted by sachin Singh <sa...@gmail.com> on 2015/04/14 07:23:58 UTC, 0 replies.
- Re: SparkSQL + Parquet performance - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/14 08:40:28 UTC, 0 replies.
- Re: streamSQL - is it available or is it in POC ? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/14 09:00:58 UTC, 0 replies.
- Running Spark on Gateway - Connecting to Resource Manager Retries - posted by Vineet Mishra <cl...@gmail.com> on 2015/04/14 09:05:38 UTC, 4 replies.
- Re: save as text file throwing null pointer error. - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/14 09:10:58 UTC, 1 replies.
- Cannot saveAsParquetFile from a RDD of case class - posted by pishen <pi...@gmail.com> on 2015/04/14 09:18:35 UTC, 4 replies.
- Spark: Using "node-local" files within functions? - posted by "Horsmann, Tobias" <to...@uni-due.de> on 2015/04/14 10:39:00 UTC, 1 replies.
- Spark SQL reading parquet decimal - posted by Clint McNeil <cl...@impactradius.com> on 2015/04/14 10:56:55 UTC, 2 replies.
- Spark Data Formats ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/14 11:29:13 UTC, 2 replies.
- Clustering users according to their shopping traits - posted by Zork Sail <zo...@gmail.com> on 2015/04/14 12:51:28 UTC, 0 replies.
- Spark Yarn-client Kerberos on remote cluster - posted by philippe L <la...@gmail.com> on 2015/04/14 12:54:17 UTC, 1 replies.
- Spark 1.2, trying to run spark-history as a service, spark-defaults.conf are ignored - posted by Serega Sheypak <se...@gmail.com> on 2015/04/14 15:42:39 UTC, 0 replies.
- How to join RDD keyValuePairs efficiently - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/14 16:40:59 UTC, 8 replies.
- How DataFrame schema migration works ? - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/14 17:00:54 UTC, 1 replies.
- Saving RDDs as custom output format - posted by Daniel Haviv <da...@veracity-group.com> on 2015/04/14 17:59:01 UTC, 1 replies.
- Re: Spark SQL 1.3.1 "saveAsParquetFile" will output tachyon file with different block size - posted by Cheng Lian <li...@gmail.com> on 2015/04/14 18:57:24 UTC, 0 replies.
- Converting Date pattern in scala code - posted by "BASAK, ANANDA" <ab...@att.com> on 2015/04/14 20:03:33 UTC, 1 replies.
- park-assembly-1.3.0-hadoop2.3.0.jar has unsigned entries - org/apache/spark/SparkHadoopWriter$.class - posted by Manoj Samel <ma...@gmail.com> on 2015/04/14 20:10:25 UTC, 0 replies.
- spark streaming printing no output - posted by Shushant Arora <sh...@gmail.com> on 2015/04/14 20:11:31 UTC, 9 replies.
- TaskResultLost - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/04/14 20:35:43 UTC, 0 replies.
- Is it feasible to keep millions of keys in state of Spark Streaming job for two months? - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/04/14 21:34:00 UTC, 3 replies.
- spark ml model info - posted by Jianguo Li <fl...@gmail.com> on 2015/04/14 22:22:29 UTC, 1 replies.
- Help understanding the FP-Growth algrithm - posted by Eric Tanner <er...@justenough.com> on 2015/04/14 22:27:24 UTC, 1 replies.
- Catching executor exception from executor in driver - posted by Justin Yip <yi...@prediction.io> on 2015/04/14 23:46:52 UTC, 1 replies.
- SparkSQL JDBC Datasources API when running on YARN - Spark 1.3.0 - posted by Nathan McCarthy <Na...@quantium.com.au> on 2015/04/15 05:57:28 UTC, 6 replies.
- Parquet Partition Size are different when using Dataframe's save append funciton - posted by 顾亮亮 <gu...@qiyi.com> on 2015/04/15 08:00:19 UTC, 0 replies.
- Do multiple ipython notebooks work on yarn in one cluster? - posted by aihe <he...@gmail.com> on 2015/04/15 08:09:48 UTC, 0 replies.
- OOM in SizeEstimator while using combineByKey - posted by Aniket Bhatnagar <an...@gmail.com> on 2015/04/15 09:10:26 UTC, 2 replies.
- Running beyond physical memory limits - posted by Brahma Reddy Battula <br...@huawei.com> on 2015/04/15 10:59:21 UTC, 6 replies.
- spark streaming with kafka - posted by Shushant Arora <sh...@gmail.com> on 2015/04/15 12:16:53 UTC, 3 replies.
- Save org.apache.spark.mllib.linalg.Matri to a file - posted by Spico Florin <sp...@gmail.com> on 2015/04/15 14:16:35 UTC, 1 replies.
- How to get a clean DataFrame schema merge - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/15 14:59:01 UTC, 1 replies.
- Execption while using kryo with broadcast - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/15 15:08:48 UTC, 7 replies.
- Spark 1.3 saveAsTextFile with codec gives error - works with Spark 1.2 - posted by Manoj Samel <ma...@gmail.com> on 2015/04/15 18:03:08 UTC, 1 replies.
- Re: multinomial and Bernoulli model in NaiveBayes - posted by Xiangrui Meng <me...@databricks.com> on 2015/04/15 19:17:50 UTC, 0 replies.
- exception during foreach run - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/15 19:58:53 UTC, 0 replies.
- [SparkSQL; Thriftserver] Help tracking missing 5 minutes - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/15 19:58:57 UTC, 0 replies.
- adding new elements to batch RDD from DStream RDD - posted by Evo Eftimov <ev...@isecc.com> on 2015/04/15 20:37:22 UTC, 7 replies.
- RAM management during cogroup and join - posted by Evo Eftimov <ev...@isecc.com> on 2015/04/15 22:11:02 UTC, 6 replies.
- Passing Elastic Search Mappings in Spark Conf - posted by Deepak Subhramanian <de...@gmail.com> on 2015/04/16 01:12:45 UTC, 2 replies.
- aliasing aggregate columns? - posted by elliott cordo <el...@gmail.com> on 2015/04/16 01:23:06 UTC, 2 replies.
- Dataset announcement - posted by Olivier Chapelle <ol...@chapelle.cc> on 2015/04/16 02:58:14 UTC, 3 replies.
- Re: Can Spark 1.0.2 run on CDH-4.3.0 with yarn? And Will Spark 1.2.0 support CDH5.1.2 with yarn? - posted by Canoe <ca...@gmail.com> on 2015/04/16 07:39:04 UTC, 0 replies.
- Spark SQL query key/value in Map - posted by "jc.francisco" <jc...@gmail.com> on 2015/04/16 09:37:57 UTC, 2 replies.
- Streaming problems running 24x7 - posted by Miquel <mi...@tecsidel.es> on 2015/04/16 10:39:24 UTC, 7 replies.
- Re: executor failed, cannot find compute-classpath.sh - posted by TimMalt <dw...@gmx.net> on 2015/04/16 11:37:11 UTC, 0 replies.
- custom input format in spark - posted by Shushant Arora <sh...@gmail.com> on 2015/04/16 12:48:00 UTC, 3 replies.
- ClassCastException processing date fields using spark SQL since 1.3.0 - posted by rkrist <rk...@vub.sk> on 2015/04/16 12:54:53 UTC, 10 replies.
- MLLib SVMWithSGD : java.lang.OutOfMemoryError: Java heap space - posted by sarath <sa...@gmail.com> on 2015/04/16 14:39:37 UTC, 1 replies.
- [SQL] DROP TABLE should also uncache table - posted by Arush Kharbanda <ar...@sigmoidanalytics.com> on 2015/04/16 14:42:13 UTC, 3 replies.
- [ThriftServer] Urgent -- very slow Metastore query from Spark - posted by Yana Kadiyska <ya...@gmail.com> on 2015/04/16 16:33:53 UTC, 0 replies.
- Problem with Spark SQL UserDefinedType and sbt assembly - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/16 17:04:29 UTC, 3 replies.
- Data partitioning and node tracking in Spark-GraphX - posted by mas <ma...@gmail.com> on 2015/04/16 17:07:52 UTC, 5 replies.
- dataframe can not find fields after loading from hive - posted by Cesar Flores <ce...@gmail.com> on 2015/04/16 18:17:58 UTC, 3 replies.
- Spark on Windows - posted by Arun Lists <li...@gmail.com> on 2015/04/16 18:23:51 UTC, 7 replies.
- Distinct is very slow - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/16 18:26:37 UTC, 12 replies.
- Custom partioner - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/16 18:46:47 UTC, 5 replies.
- General configurations on CDH5 to achieve maximum Spark Performance - posted by Manish Gupta 8 <mg...@sapient.com> on 2015/04/16 19:02:37 UTC, 5 replies.
- saveAsTextFile - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/16 19:32:30 UTC, 9 replies.
- Random pairs / RDD order - posted by abellet <au...@telecom-paristech.fr> on 2015/04/16 20:00:12 UTC, 6 replies.
- StackOverflowError from KafkaReceiver when rate limiting used - posted by Jeff Nadler <jn...@srcginc.com> on 2015/04/16 20:31:16 UTC, 1 replies.
- spark.dynamicAllocation.minExecutors - posted by Michael Stone <ms...@mathom.us> on 2015/04/16 20:41:48 UTC, 7 replies.
- MLlib - Naive Bayes Problem - posted by riginos <sa...@gmail.com> on 2015/04/16 22:00:48 UTC, 1 replies.
- When querying ElasticSearch, score is 0 - posted by Andrejs Abele <an...@insight-centre.org> on 2015/04/16 22:45:04 UTC, 2 replies.
- AMP Lab Indexed RDD - question for Data Bricks AMP Labs - posted by Evo Eftimov <ev...@isecc.com> on 2015/04/16 23:00:17 UTC, 3 replies.
- dataframe call, how to control number of tasks for a stage - posted by Neal Yin <ne...@workday.com> on 2015/04/17 00:04:16 UTC, 0 replies.
- Re: dataframe call, how to control number of tasks for a stage - posted by Yin Huai <yh...@databricks.com> on 2015/04/17 00:41:18 UTC, 0 replies.
- mapPartitions() in Java 8 - posted by samirissa00 <sa...@gmail.com> on 2015/04/17 00:57:03 UTC, 0 replies.
- Python 3 support for PySpark has been merged into master - posted by Josh Rosen <ro...@gmail.com> on 2015/04/17 01:46:26 UTC, 0 replies.
- SparkR: Server IPC version 9 cannot communicate with client version 4 - posted by "lalasriza ." <la...@gmail.com> on 2015/04/17 01:47:26 UTC, 1 replies.
- Base metrics for Spark Benchmarking. - posted by Bijay Pathak <bi...@cloudwick.com> on 2015/04/17 02:10:48 UTC, 0 replies.
- Tuple join - posted by Flavio Pompermaier <po...@okkam.it> on 2015/04/17 08:37:25 UTC, 0 replies.
- Spark Directed Acyclic Graph / Jobs - posted by James King <ja...@gmail.com> on 2015/04/17 09:26:08 UTC, 2 replies.
- about understanding web ui - posted by Hoai-Thu Vuong <th...@gmail.com> on 2015/04/17 09:31:05 UTC, 3 replies.
- SQL UserDefinedType can't be saved in parquet file when using assembly jar - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/04/17 09:45:27 UTC, 1 replies.
- Path issue in running spark - posted by mas <ma...@gmail.com> on 2015/04/17 10:22:39 UTC, 0 replies.
- Some questions on Multiple Streams - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/04/17 11:36:56 UTC, 4 replies.
- Executor memory in web UI - posted by podioss <gr...@hotmail.com> on 2015/04/17 12:30:53 UTC, 1 replies.
- Addition of new Metrics for killed executors. - posted by Archit Thakur <ar...@gmail.com> on 2015/04/17 12:37:54 UTC, 3 replies.
- ClassCastException while caching a query - posted by Tash Chainar <tc...@gmail.com> on 2015/04/17 12:52:00 UTC, 0 replies.
- SparkStreaming 1.3.0 fileNotFound Exception while using WAL & Checkpoints - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/04/17 13:24:26 UTC, 0 replies.
- Streaming Linear Regression problem - posted by barisak <ba...@gmail.com> on 2015/04/17 13:56:57 UTC, 2 replies.
- Re: Joined RDD - posted by Archit Thakur <ar...@gmail.com> on 2015/04/17 13:58:34 UTC, 1 replies.
- Running into several problems with Data Frames - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2015/04/17 16:04:05 UTC, 0 replies.
- history-server does't read logs which are on FS - posted by Serega Sheypak <se...@gmail.com> on 2015/04/17 17:13:03 UTC, 2 replies.
- Which version of Hive QL is Spark 1.3.0 using? - posted by ARose <As...@telarix.com> on 2015/04/17 18:18:08 UTC, 1 replies.
- local directories for spark running on yarn - posted by shenyanls <sh...@gmail.com> on 2015/04/17 18:18:33 UTC, 1 replies.
- How to persist RDD return from partitionBy() to disk? - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/17 18:35:36 UTC, 1 replies.
- When are TaskCompletionListeners called? - posted by Akshat Aranya <aa...@gmail.com> on 2015/04/17 19:23:08 UTC, 1 replies.
- Spark hanging after main method completes - posted by apropion <ca...@yahoo.com> on 2015/04/17 19:57:07 UTC, 1 replies.
- Metrics Servlet on spark 1.2 - posted by Udit Mehta <um...@groupon.com> on 2015/04/17 20:22:47 UTC, 0 replies.
- Re: Spark Code to read RCFiles - posted by gle <gl...@gmail.com> on 2015/04/17 21:25:43 UTC, 1 replies.
- Need Costom RDD - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/17 22:02:11 UTC, 0 replies.
- Re: Why does the HDFS parquet file generated by Spark SQL have different size with those on Tachyon? - posted by Reynold Xin <rx...@databricks.com> on 2015/04/17 22:55:53 UTC, 0 replies.
- Re: HDP 2.2 AM abort : Unable to find ExecutorLauncher class - posted by Udit Mehta <um...@groupon.com> on 2015/04/17 23:01:24 UTC, 8 replies.
- How to avoid “Invalid checkpoint directory” error in apache Spark? - posted by Peng Cheng <pc...@uow.edu.au> on 2015/04/18 00:43:30 UTC, 0 replies.
- Announcing Spark 1.3.1 and 1.2.2 - posted by Patrick Wendell <pw...@gmail.com> on 2015/04/18 00:53:47 UTC, 0 replies.
- Can't get SparkListener to work - posted by Praveen Balaji <se...@gmail.com> on 2015/04/18 00:54:24 UTC, 8 replies.
- Verifying multiple workers are being used - posted by mj...@columbus.rr.com on 2015/04/18 04:19:12 UTC, 0 replies.
- spark application was submitted twice unexpectedly - posted by Pengcheng Liu <pc...@gmail.com> on 2015/04/18 10:30:35 UTC, 1 replies.
- MLlib -Collaborative Filtering - posted by riginos <sa...@gmail.com> on 2015/04/18 11:08:41 UTC, 4 replies.
- spark with kafka - posted by Shushant Arora <sh...@gmail.com> on 2015/04/18 12:43:01 UTC, 9 replies.
- Number of input partitions in SparkContext.sequenceFile - posted by Wenlei Xie <we...@gmail.com> on 2015/04/18 14:50:53 UTC, 2 replies.
- Does reduceByKey only work properly for numeric keys? - posted by SecondDatke <lo...@outlook.com> on 2015/04/18 17:17:47 UTC, 7 replies.
- newAPIHadoopRDD file name - posted by Manas Kar <ma...@gmail.com> on 2015/04/18 18:36:53 UTC, 2 replies.
- shuffle.FetchFailedException in spark on YARN job - posted by roy <rp...@njit.edu> on 2015/04/18 19:44:50 UTC, 2 replies.
- Can a map function return null - posted by Steve Lewis <lo...@gmail.com> on 2015/04/18 20:43:45 UTC, 5 replies.
- Dataframes Question - posted by Arun Patel <ar...@gmail.com> on 2015/04/19 01:43:55 UTC, 4 replies.
- spark sql error with proto/parquet - posted by "Abhishek R. Singh" <ab...@tetrationanalytics.com> on 2015/04/19 02:40:17 UTC, 1 replies.
- Spark-csv data source: infer data types - posted by Oleg Shirokikh <Ol...@solver.com> on 2015/04/19 06:00:01 UTC, 1 replies.
- Spark Cassandra Connector - posted by DStrip <d....@hotmail.com> on 2015/04/19 06:02:13 UTC, 1 replies.
- Date class not supported by SparkSQL - posted by Lior Chaga <li...@taboola.com> on 2015/04/19 15:27:33 UTC, 1 replies.
- Aggregation by column and generating a json - posted by dsub <de...@gmail.com> on 2015/04/19 20:51:52 UTC, 0 replies.
- Skipped Jobs - posted by James King <ja...@gmail.com> on 2015/04/19 22:42:53 UTC, 3 replies.
- how to make a spark cluster ? - posted by hnahak <ha...@gmail.com> on 2015/04/20 00:11:08 UTC, 4 replies.
- Data frames in GraphX - posted by hnahak <ha...@gmail.com> on 2015/04/20 00:32:27 UTC, 0 replies.
- GraphX: unbalanced computation and slow runtime on livejournal network - posted by harenbergsd <ha...@ncsu.edu> on 2015/04/20 01:02:52 UTC, 2 replies.
- compliation error - posted by Brahma Reddy Battula <br...@huawei.com> on 2015/04/20 04:25:24 UTC, 6 replies.
- Code Deployment tools in Production - posted by Arun Patel <ar...@gmail.com> on 2015/04/20 04:44:44 UTC, 1 replies.
- [STREAMING KAFKA - Direct Approach] JavaPairRDD cannot be cast to HasOffsetRanges - posted by RimBerry <tr...@gmail.com> on 2015/04/20 05:41:52 UTC, 1 replies.
- SparkStreaming onStart not being invoked on CustomReceiver attached to master with multiple workers - posted by Ankit Patel <pa...@hotmail.com> on 2015/04/20 06:13:25 UTC, 5 replies.
- sparksql - HiveConf not found during task deserialization - posted by Manku Timma <ma...@gmail.com> on 2015/04/20 08:22:09 UTC, 9 replies.
- How to run spark programs in eclipse like mapreduce - posted by sandeep vura <sa...@gmail.com> on 2015/04/20 08:44:03 UTC, 3 replies.
- Running spark over HDFS - posted by madhvi <ma...@orkash.com> on 2015/04/20 08:52:12 UTC, 12 replies.
- NEWBIE/not able to connect to postgresql using jdbc - posted by shashanksoni <sh...@gmail.com> on 2015/04/20 08:56:49 UTC, 1 replies.
- SparkSQL performance - posted by Renato Marroquín Mogrovejo <re...@gmail.com> on 2015/04/20 09:31:56 UTC, 7 replies.
- [pyspark] Starting workers in a virtualenv - posted by Karlson <ks...@siberie.de> on 2015/04/20 09:51:20 UTC, 0 replies.
- Spark-1.2.2-bin-hadoop2.4.tgz missing - posted by Zsolt Tóth <to...@gmail.com> on 2015/04/20 09:59:03 UTC, 2 replies.
- Order of execution of tasks inside of a stage and computing the number of stages - posted by Spico Florin <sp...@gmail.com> on 2015/04/20 12:24:57 UTC, 0 replies.
- mapPartitions vs foreachPartition - posted by Arun Patel <ar...@gmail.com> on 2015/04/20 12:35:48 UTC, 3 replies.
- writing to hdfs on master node much faster - posted by jamborta <ja...@gmail.com> on 2015/04/20 13:21:30 UTC, 3 replies.
- Spark 1.3.1 - SQL Issues - posted by ayan guha <gu...@gmail.com> on 2015/04/20 14:23:17 UTC, 0 replies.
- When the old data dropped from the cache? - posted by Tash Chainar <tc...@gmail.com> on 2015/04/20 15:07:34 UTC, 1 replies.
- Custom Partitioning Spark - posted by mas <ma...@gmail.com> on 2015/04/20 15:56:43 UTC, 3 replies.
- Equal number of RDD Blocks - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/04/20 16:15:09 UTC, 6 replies.
- Configuring logging properties for executor - posted by Michael Ryabtsev <mi...@gmail.com> on 2015/04/20 16:26:14 UTC, 5 replies.
- Understanding the build params for spark with sbt. - posted by Shiyao Ma <i...@introo.me> on 2015/04/20 16:40:29 UTC, 2 replies.
- Fail to read files from s3 after upgrading to 1.3 - posted by Ophir Cohen <op...@gmail.com> on 2015/04/20 16:43:11 UTC, 2 replies.
- Issue of running partitioned loading (RDD) in Spark External Datasource on Mesos - posted by Yang Lei <ge...@gmail.com> on 2015/04/20 16:46:49 UTC, 0 replies.
- Spark SQL vs map reduce tableInputOutput - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/20 16:54:42 UTC, 4 replies.
- How can i custom the external initialize when start the spark cluster - posted by 年轻的信赖 <17...@qq.com> on 2015/04/20 17:44:06 UTC, 0 replies.
- Unsupported types in org.apache.spark.sql.jdbc.JDBCRDD$.getCatalystType - posted by ARose <As...@telarix.com> on 2015/04/20 17:58:21 UTC, 0 replies.
- Custom paritioning of DSTream - posted by Evo Eftimov <ev...@isecc.com> on 2015/04/20 18:12:03 UTC, 3 replies.
- Shuffle files not cleaned up (Spark 1.2.1) - posted by N B <nb...@gmail.com> on 2015/04/20 19:42:13 UTC, 7 replies.
- Task not Serializable: Graph is unexpectedly null when DStream is being serialized - posted by Jean-Pascal Billaud <jp...@tellapart.com> on 2015/04/20 19:44:16 UTC, 8 replies.
- Re: Did anybody run Spark-perf on powerpc? - posted by zapstar <me...@thirumal.in> on 2015/04/20 19:50:03 UTC, 1 replies.
- Spark metrics source - posted by Udit Mehta <um...@groupon.com> on 2015/04/20 20:26:15 UTC, 0 replies.
- GSSException when submitting Spark job in yarn-cluster mode with HiveContext APIs on Kerberos cluster - posted by Andrew Lee <al...@hotmail.com> on 2015/04/20 22:58:49 UTC, 2 replies.
- HiveContext vs SQLContext - posted by Daniel Mahler <dm...@gmail.com> on 2015/04/20 23:17:06 UTC, 2 replies.
- Instantiating/starting Spark jobs programmatically - posted by Ajay Singal <as...@gmail.com> on 2015/04/20 23:29:53 UTC, 5 replies.
- ChiSquared Test from user response flat files to RDD[Vector]? - posted by "Dan DeCapria, CivicScience" <da...@civicscience.com> on 2015/04/20 23:32:52 UTC, 1 replies.
- Updating a Column in a DataFrame - posted by ARose <As...@telarix.com> on 2015/04/21 00:38:52 UTC, 2 replies.
- Why is Columnar Parquet used as default for saving Row-based DataFrames/RDD? - posted by Duy Lan Nguyen <nd...@gmail.com> on 2015/04/21 01:08:24 UTC, 1 replies.
- Re: Spark Performance on Yarn - posted by Peng Cheng <pc...@uow.edu.au> on 2015/04/21 01:17:08 UTC, 4 replies.
- MLlib - Collaborative Filtering - trainImplicit task size - posted by "Christian S. Perone" <ch...@gmail.com> on 2015/04/21 03:56:35 UTC, 5 replies.
- meet weird exception when studying rdd caching - posted by donhoff_h <16...@qq.com> on 2015/04/21 04:39:29 UTC, 1 replies.
- invalidate caching for hadoopFile input? - posted by Wei Wei <vi...@gmail.com> on 2015/04/21 05:15:28 UTC, 1 replies.
- Map-Side Join in Spark - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/21 06:10:00 UTC, 8 replies.
- Spark and accumulo - posted by madhvi <ma...@orkash.com> on 2015/04/21 06:25:21 UTC, 2 replies.
- WebUI shows poor locality when task schduling - posted by eric wong <wi...@gmail.com> on 2015/04/21 09:00:33 UTC, 1 replies.
- Different behavioural of HiveContext vs. Hive? - posted by Ophir Cohen <op...@gmail.com> on 2015/04/21 09:49:31 UTC, 1 replies.
- Features scaling - posted by Denys Kozyr <dk...@gmail.com> on 2015/04/21 10:00:29 UTC, 1 replies.
- Column renaming after DataFrame.groupBy - posted by Justin Yip <yi...@prediction.io> on 2015/04/21 10:06:32 UTC, 2 replies.
- Spark Streaming updatyeStateByKey throws OutOfMemory Error - posted by Sourav Chandra <so...@livestream.com> on 2015/04/21 11:23:17 UTC, 6 replies.
- Compression and Hive with Spark 1.3 - posted by Ophir Cohen <op...@gmail.com> on 2015/04/21 11:40:49 UTC, 1 replies.
- Cassandra Connection Issue with Spark-jobserver - posted by Anand <an...@monotype.com> on 2015/04/21 12:10:13 UTC, 2 replies.
- Meet Exception when learning Broadcast Variables - posted by donhoff_h <16...@qq.com> on 2015/04/21 12:19:05 UTC, 2 replies.
- SPOF in Spark driver in yarn-client mode. - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/04/21 12:21:23 UTC, 0 replies.
- Spark Unit Testing - posted by James King <ja...@gmail.com> on 2015/04/21 13:26:39 UTC, 2 replies.
- UserDefinedTypes for SparkSQL Pitfalls (solved) - posted by kmader <ke...@gmail.com> on 2015/04/21 13:55:25 UTC, 0 replies.
- Spark 1.3.1 Dataframe breaking ALS.train? - posted by ayan guha <gu...@gmail.com> on 2015/04/21 13:58:00 UTC, 4 replies.
- Problem with using Spark ML - posted by Staffan <st...@gmail.com> on 2015/04/21 14:15:50 UTC, 2 replies.
- Shuffle question - posted by Marius Danciu <ma...@gmail.com> on 2015/04/21 14:38:25 UTC, 3 replies.
- Spark REPL no progress when run in cluster - posted by bipin <bi...@gmail.com> on 2015/04/21 14:56:50 UTC, 1 replies.
- mapred.reduce.tasks - posted by Shushant Arora <sh...@gmail.com> on 2015/04/21 14:57:01 UTC, 0 replies.
- StandardScaler failing with OOM errors in PySpark - posted by rok <ro...@gmail.com> on 2015/04/21 15:59:01 UTC, 5 replies.
- Join on DataFrames from the same source (Pyspark) - posted by Karlson <ks...@siberie.de> on 2015/04/21 16:42:58 UTC, 6 replies.
- implicits is not a member of org.apache.spark.sql.SQLContext - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/21 16:54:10 UTC, 2 replies.
- Clustering algorithms in Spark - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/21 17:21:20 UTC, 2 replies.
- Error while running SparkPi in Hadoop HA - posted by "Fernando O." <fo...@gmail.com> on 2015/04/21 19:03:37 UTC, 1 replies.
- Spark Scala Version? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/21 19:26:23 UTC, 1 replies.
- How does GraphX stores the routing table? - posted by mas <ma...@gmail.com> on 2015/04/21 19:39:50 UTC, 2 replies.
- Multiple HA spark clusters managed by 1 ZK cluster? - posted by Michal Klos <mi...@gmail.com> on 2015/04/21 20:32:21 UTC, 5 replies.
- problem writing to s3 - posted by Daniel Mahler <dm...@gmail.com> on 2015/04/22 01:15:13 UTC, 6 replies.
- Efficient saveAsTextFile by key, directory for each key? - posted by Arun Luthra <ar...@gmail.com> on 2015/04/22 02:45:51 UTC, 1 replies.
- Throw Error: Delegation Token can be issued only with kerberos or web authentication when use saveAsNewAPIHadoopFile - posted by yuemeng1 <yu...@huawei.com> on 2015/04/22 04:41:33 UTC, 0 replies.
- Not able run multiple tasks in parallel, spark streaming - posted by Abhay Bansal <ab...@gmail.com> on 2015/04/22 05:27:52 UTC, 3 replies.
- Building Spark : Adding new DataType in Catalyst - posted by zia_kayani <zi...@platalytics.com> on 2015/04/22 09:23:44 UTC, 2 replies.
- [MLlib] fail to run word2vec - posted by gm yu <hu...@gmail.com> on 2015/04/22 09:32:30 UTC, 1 replies.
- Hive table creation - possible bug in Spark 1.3? - posted by Ophir Cohen <op...@gmail.com> on 2015/04/22 09:50:29 UTC, 3 replies.
- Error in creating spark RDD - posted by madhvi <ma...@orkash.com> on 2015/04/22 11:12:30 UTC, 4 replies.
- Start ThriftServer Error - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/04/22 12:22:35 UTC, 3 replies.
- Scheduling across applications - Need suggestion - posted by Arun Patel <ar...@gmail.com> on 2015/04/22 12:28:28 UTC, 2 replies.
- How to write Hive's map(key, value, ...) in Spark SQL DSL - posted by Jianshi Huang <ji...@gmail.com> on 2015/04/22 13:25:58 UTC, 0 replies.
- String literal in dataframe.select(...) - posted by Jianshi Huang <ji...@gmail.com> on 2015/04/22 13:27:55 UTC, 1 replies.
- Spark SQL performance issue. - posted by Nikolay Tikhonov <ti...@gmail.com> on 2015/04/22 13:47:45 UTC, 5 replies.
- Convert DStream to DataFrame - posted by Sergio Jiménez Barrio <dr...@gmail.com> on 2015/04/22 15:03:58 UTC, 8 replies.
- Auto Starting a Spark Job on Cluster Startup - posted by James King <ja...@gmail.com> on 2015/04/22 15:09:41 UTC, 1 replies.
- How to merge two dataframes with same schema - posted by bipin <bi...@gmail.com> on 2015/04/22 15:38:10 UTC, 1 replies.
- Building Spark : Building just one module. - posted by zia_kayani <zi...@platalytics.com> on 2015/04/22 15:42:30 UTC, 1 replies.
- Can I index a column in parquet file to make it join faster - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/22 16:15:10 UTC, 0 replies.
- the indices of SparseVector must be ordered while computing SVD - posted by yaochunnan <ya...@gmail.com> on 2015/04/22 17:26:54 UTC, 1 replies.
- Master <-chatter -> Worker - posted by James King <ja...@gmail.com> on 2015/04/22 17:59:00 UTC, 0 replies.
- Map Question - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/22 18:28:55 UTC, 10 replies.
- spark 1.3.1 : unable to access s3n:// urls (no file system for scheme s3n:) - posted by Sujee Maniyam <su...@sujee.net> on 2015/04/22 18:45:10 UTC, 5 replies.
- RDD.filter vs. RDD.join--advice please - posted by hokiegeek2 <so...@gmail.com> on 2015/04/22 19:11:02 UTC, 1 replies.
- Spark streaming action running the same work in parallel - posted by ColinMc <co...@shiftenergy.com> on 2015/04/22 19:16:29 UTC, 2 replies.
- Trouble working with Spark-CSV package (error: object databricks is not a member of package com) - posted by Mohammed Omer <be...@gmail.com> on 2015/04/22 20:01:01 UTC, 3 replies.
- ElasticSearch for Spark times out - posted by Adrian Mocanu <am...@verticalscope.com> on 2015/04/22 20:49:11 UTC, 7 replies.
- Spark SQL: SchemaRDD, DataFrame. Multi-value, Nested attributes - posted by Eugene Morozov <fa...@list.ru> on 2015/04/22 21:15:25 UTC, 0 replies.
- beeline that comes with spark 1.3.0 doesn't work with "--hiveconf" or ''--hivevar" which substitutes variables at hive scripts. - posted by ogoh <ok...@gmail.com> on 2015/04/23 00:04:57 UTC, 0 replies.
- setting cost in linear SVM [Python] - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/04/23 00:25:06 UTC, 1 replies.
- why does groupByKey return RDD[(K, Iterable[V])] not RDD[(K, CompactBuffer[V])] ? - posted by Hao Ren <in...@gmail.com> on 2015/04/23 00:43:43 UTC, 4 replies.
- spark-ec2 s3a filesystem support and hadoop versions - posted by Daniel Mahler <dm...@gmail.com> on 2015/04/23 01:04:10 UTC, 1 replies.
- Re: [spark-csv] Trouble working with Spark-CSV package (error: object databricks is not a member of package com) (#54) - posted by Mohammed Omer <be...@gmail.com> on 2015/04/23 02:14:31 UTC, 0 replies.
- How to access HBase on Spark SQL - posted by do...@sina.com on 2015/04/23 03:22:24 UTC, 0 replies.
- .toPairDStreamFunctions method not found - posted by avseq <su...@gmail.com> on 2015/04/23 05:22:17 UTC, 0 replies.
- Spark RDD Lifecycle: whether RDD will be reclaimed out of scope - posted by Jeffery <yu...@gmail.com> on 2015/04/23 05:58:48 UTC, 1 replies.
- Re: StackOverflow Error when run ALS with 100 iterations - posted by amghost <zh...@outlook.com> on 2015/04/23 06:03:10 UTC, 1 replies.
- LDA code little error @Xiangrui Meng - posted by buring <qy...@gmail.com> on 2015/04/23 06:34:37 UTC, 1 replies.
- How to debug Spark on Yarn? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/23 07:48:25 UTC, 8 replies.
- Loading lots of .parquet files in Spark 1.3.1 (Hadoop 2.4) - posted by cosmincatalin <co...@gmail.com> on 2015/04/23 07:56:38 UTC, 0 replies.
- IOUtils cannot write anything in Spark? - posted by Xi Shen <da...@gmail.com> on 2015/04/23 08:03:47 UTC, 1 replies.
- problem with spark thrift server - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/04/23 08:59:07 UTC, 3 replies.
- spark yarn-cluster job failing in batch processing - posted by sachin Singh <sa...@gmail.com> on 2015/04/23 09:10:48 UTC, 0 replies.
- Pipeline in pyspark - posted by Suraj Shetiya <su...@gmail.com> on 2015/04/23 09:12:05 UTC, 1 replies.
- A Spark Group by is running forever - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/23 09:53:38 UTC, 2 replies.
- Is a higher-res or vector version of Spark logo available? - posted by Enno Shioji <es...@gmail.com> on 2015/04/23 13:09:41 UTC, 0 replies.
- ML regression - spark context dies without error - posted by jamborta <ja...@gmail.com> on 2015/04/23 13:35:42 UTC, 0 replies.
- Contributors, read me! Updated Contributing to Spark wiki - posted by Sean Owen <so...@cloudera.com> on 2015/04/23 13:47:59 UTC, 0 replies.
- Is there a way to get the list of all jobs? - posted by mkestemont <ma...@hotmail.com> on 2015/04/23 15:07:00 UTC, 0 replies.
- Streaming Kmeans usage in java - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/23 15:37:00 UTC, 0 replies.
- How to start Thrift JDBC server as part of standalone spark application? - posted by Vladimir Grigor <vl...@kiosked.com> on 2015/04/23 15:42:48 UTC, 0 replies.
- dynamicAllocation & spark-shell - posted by Michael Stone <ms...@mathom.us> on 2015/04/23 16:57:44 UTC, 1 replies.
- Spark + Hue - posted by "MrAsanjar ." <af...@gmail.com> on 2015/04/23 16:58:20 UTC, 0 replies.
- Tasks run only on one machine - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/04/23 18:51:26 UTC, 7 replies.
- [Spark Streaming] Help with updateStateByKey() - posted by allonsy <lu...@gmail.com> on 2015/04/23 18:59:29 UTC, 0 replies.
- Slower performance when bigger memory? - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/23 19:14:29 UTC, 7 replies.
- Bug? Can't reference to the column by name after join two DataFrame on a same name key - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/23 20:14:27 UTC, 2 replies.
- Non-Deterministic Graph Building - posted by hokiegeek2 <so...@gmail.com> on 2015/04/23 21:56:17 UTC, 1 replies.
- Question regarding join with multiple columns with pyspark - posted by Ali Bajwa <al...@gmail.com> on 2015/04/23 22:05:21 UTC, 4 replies.
- Getting error running MLlib example with new cluster - posted by Su She <su...@gmail.com> on 2015/04/23 23:10:38 UTC, 3 replies.
- Pyspark where do third parties libraries need to be installed under Yarn-client mode - posted by dusts66 <du...@gmail.com> on 2015/04/23 23:20:23 UTC, 2 replies.
- gridsearch - python - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/04/23 23:47:25 UTC, 3 replies.
- Understanding Spark/MLlib failures - posted by aleverentz <An...@fico.com> on 2015/04/24 01:11:25 UTC, 5 replies.
- Spark SQL - Setting YARN Classpath for primordial class loader - posted by Night Wolf <ni...@gmail.com> on 2015/04/24 03:38:44 UTC, 3 replies.
- spark 1.3.0 strange log message - posted by Henry Hung <YT...@winbond.com> on 2015/04/24 05:23:06 UTC, 0 replies.
- Re: spark 1.3.0 strange log message - posted by Terry Hole <hu...@gmail.com> on 2015/04/24 05:27:03 UTC, 0 replies.
- Is the Spark-1.3.1 support build with scala 2.8 ? - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/04/24 05:52:33 UTC, 2 replies.
- what is the best way to transfer data from RDBMS to spark? - posted by sequoiadb <ma...@sequoiadb.com> on 2015/04/24 10:14:55 UTC, 3 replies.
- Parquet error reading data that contains array of structs - posted by Jianshi Huang <ji...@gmail.com> on 2015/04/24 11:40:33 UTC, 8 replies.
- Spark Cluster Setup - posted by James King <ja...@gmail.com> on 2015/04/24 12:01:02 UTC, 4 replies.
- Multiclass classification using Ml logisticRegression - posted by Selim Namsi <se...@gmail.com> on 2015/04/24 12:26:59 UTC, 3 replies.
- Does HadoopRDD.zipWithIndex method preserve the order of the input data from Hadoop? - posted by Spico Florin <sp...@gmail.com> on 2015/04/24 15:05:52 UTC, 16 replies.
- Spark RDD sortByKey triggering a new job - posted by Spico Florin <sp...@gmail.com> on 2015/04/24 15:57:33 UTC, 1 replies.
- Disable partition discovery - posted by cosmincatalin <co...@gmail.com> on 2015/04/24 16:38:08 UTC, 1 replies.
- What are the likely causes of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle? - posted by Peng Cheng <pc...@uow.edu.au> on 2015/04/24 16:59:23 UTC, 0 replies.
- [Ml][Dataframe] Ml pipeline & dataframe repartitioning - posted by Peter Rudenko <pe...@gmail.com> on 2015/04/24 17:20:41 UTC, 3 replies.
- Convert DStream[Long] to Long - posted by Sergio Jiménez Barrio <dr...@gmail.com> on 2015/04/24 19:50:15 UTC, 7 replies.
- tachyon on machines launched with spark-ec2 scripts - posted by Daniel Mahler <dm...@gmail.com> on 2015/04/24 21:21:53 UTC, 1 replies.
- DAG - posted by Giovanni Paolo Gibilisco <gi...@gmail.com> on 2015/04/24 21:59:37 UTC, 2 replies.
- Spark Internal Job Scheduling - posted by Arpit1286 <ar...@ymail.com> on 2015/04/24 22:03:44 UTC, 0 replies.
- indexing an RDD [Python] - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/04/24 22:51:31 UTC, 3 replies.
- Creating a Row in SparkSQL 1.2 from ArrayList - posted by Wenlei Xie <we...@gmail.com> on 2015/04/24 22:56:41 UTC, 1 replies.
- Spark on Mesos - posted by Stephen Carman <sc...@coldlight.com> on 2015/04/24 23:15:10 UTC, 3 replies.
- Re: StreamingContext.textFileStream issue - posted by Yang Lei <ge...@gmail.com> on 2015/04/24 23:32:11 UTC, 0 replies.
- contributing code - how to test - posted by Deborah Siegel <de...@gmail.com> on 2015/04/25 01:35:50 UTC, 1 replies.
- Re: StreamingContext.textFileStream issue - posted by Prannoy <pr...@sigmoidanalytics.com> on 2015/04/25 01:40:25 UTC, 2 replies.
- ORCFiles - posted by David Mitchell <jd...@gmail.com> on 2015/04/25 02:45:54 UTC, 1 replies.
- Customized Aggregation Query on Spark SQL - posted by Wenlei Xie <we...@gmail.com> on 2015/04/25 06:32:14 UTC, 2 replies.
- How can I retrieve item-pair after calculating similarity using RowMatrix - posted by amghost <zh...@outlook.com> on 2015/04/25 08:20:50 UTC, 2 replies.
- KMeans takeSample jobs and RDD cached - posted by podioss <gr...@hotmail.com> on 2015/04/25 15:36:44 UTC, 1 replies.
- Spark SQL 1.3.1: java.lang.ClassCastException is thrown - posted by do...@sina.com on 2015/04/25 15:59:52 UTC, 0 replies.
- Re: Spark SQL 1.3.1: java.lang.ClassCastException is thrown - posted by Ted Yu <yu...@gmail.com> on 2015/04/25 16:04:33 UTC, 1 replies.
- Reply: Re: Spark SQL 1.3.1: java.lang.ClassCastException is thrown - posted by do...@sina.com on 2015/04/25 16:22:52 UTC, 1 replies.
- directory loader in windows - posted by ayan guha <gu...@gmail.com> on 2015/04/25 16:38:58 UTC, 6 replies.
- Re: spark1.3.1 using mysql error! - posted by Anand Mohan <ch...@gmail.com> on 2015/04/25 18:56:00 UTC, 0 replies.
- Querying Cluster State - posted by James King <ja...@gmail.com> on 2015/04/26 10:31:20 UTC, 6 replies.
- SQL UDF returning object of case class; regression from 1.2.0 - posted by Ophir Cohen <op...@gmail.com> on 2015/04/26 13:26:05 UTC, 1 replies.
- Timeout Error - posted by Deepak Gopalakrishnan <dg...@gmail.com> on 2015/04/26 15:57:32 UTC, 5 replies.
- Complexity of transformations in Spark - posted by Vijayasarathy Kannan <kv...@vt.edu> on 2015/04/26 19:42:21 UTC, 1 replies.
- Re: Spark timeout issue - posted by Patrick Wendell <pw...@gmail.com> on 2015/04/26 23:28:04 UTC, 2 replies.
- Spark SQL - Registerfunction throwing MissingRequirementError in JavaMirror with primordial classloader - posted by Sunita Arvind <su...@gmail.com> on 2015/04/26 23:29:53 UTC, 0 replies.
- Understand the running time of SparkSQL queries - posted by Wenlei Xie <we...@gmail.com> on 2015/04/27 06:25:00 UTC, 1 replies.
- Question on Spark SQL performance of Range Queries on Large Datasets - posted by Mani <ma...@vt.edu> on 2015/04/27 09:58:29 UTC, 1 replies.
- A problem of using spark streaming to capture network packets - posted by Hai Shan Wu <wu...@cn.ibm.com> on 2015/04/27 11:03:49 UTC, 0 replies.
- How to distribute Spark computation recipes - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2015/04/27 11:49:44 UTC, 1 replies.
- Spark + Mesos + HDFS resource split - posted by Ankur Chauhan <ac...@brightcove.com> on 2015/04/27 11:59:39 UTC, 0 replies.
- spark-defaults.conf - posted by James King <ja...@gmail.com> on 2015/04/27 12:55:09 UTC, 3 replies.
- ReduceByKey and sorting within partitions - posted by Marco <ma...@gmail.com> on 2015/04/27 13:00:26 UTC, 3 replies.
- Spark 1.2.1: How to convert SchemaRDD to CassandraRDD? - posted by Tash Chainar <tc...@gmail.com> on 2015/04/27 13:15:12 UTC, 0 replies.
- Bigints in pyspark - posted by jamborta <ja...@gmail.com> on 2015/04/27 14:44:32 UTC, 0 replies.
- Exception in using updateStateByKey - posted by Sea <26...@qq.com> on 2015/04/27 15:21:18 UTC, 1 replies.
- Reply: Exception in using updateStateByKey - posted by Sea <26...@qq.com> on 2015/04/27 16:32:01 UTC, 1 replies.
- data locality in spark - posted by Grandl Robert <rg...@yahoo.com.INVALID> on 2015/04/27 17:30:03 UTC, 0 replies.
- Driver ID from spark-submit - posted by Rares Vernica <rv...@gmail.com> on 2015/04/27 17:38:38 UTC, 0 replies.
- Spark JDBC data source API issue with mysql - posted by madhu phatak <ph...@gmail.com> on 2015/04/27 17:42:26 UTC, 0 replies.
- Spark 1.3.1 Hadoop 2.4 Prebuilt package broken ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/27 17:43:11 UTC, 3 replies.
- Re: What is difference btw reduce & fold? - posted by keegan <ke...@l-3com.com> on 2015/04/27 18:27:47 UTC, 0 replies.
- spark sql LEFT OUTER JOIN java.lang.ClassCastException - posted by kiran mavatoor <ki...@yahoo.com.INVALID> on 2015/04/27 19:39:20 UTC, 0 replies.
- Streaming app with windowing and persistence - posted by Alexander Krasheninnikov <a....@corp.badoo.com> on 2015/04/27 19:57:30 UTC, 0 replies.
- does randomSplit return a copy or a reference to the original rdd? [Python] - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/04/27 20:28:28 UTC, 0 replies.
- Group by order by - posted by "Ulanov, Alexander" <al...@hp.com> on 2015/04/27 21:07:34 UTC, 4 replies.
- Understanding Spark's caching - posted by Eran Medan <er...@gmail.com> on 2015/04/27 21:28:44 UTC, 2 replies.
- bug: numClasses is not a valid argument of LogisticRegressionWithSGD - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/04/27 21:37:02 UTC, 0 replies.
- [SQL][1.3.1][JAVA] UDF in java cause Task not serializable - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/27 21:53:05 UTC, 0 replies.
- [SQL][1.3.1][JAVA]Use UDF in DataFrame join - posted by Shuai Zheng <sz...@gmail.com> on 2015/04/27 21:57:01 UTC, 0 replies.
- Change ivy cache for spark on Windows - posted by mj <jo...@gmail.com> on 2015/04/27 22:49:31 UTC, 1 replies.
- Re: hive-thriftserver maven artifact - posted by Ted Yu <yu...@gmail.com> on 2015/04/27 23:51:18 UTC, 2 replies.
- Automatic Cache in SparkSQL - posted by Wenlei Xie <we...@gmail.com> on 2015/04/27 23:59:14 UTC, 1 replies.
- Re: Powered By Spark - posted by Justin <ju...@atp.io> on 2015/04/28 02:20:15 UTC, 0 replies.
- Scalability of group by - posted by "Ulanov, Alexander" <al...@hp.com> on 2015/04/28 03:28:04 UTC, 4 replies.
- Why Spark is much faster than Hadoop MapReduce even on disk - posted by "bit1129@163.com" <bi...@163.com> on 2015/04/28 04:33:33 UTC, 5 replies.
- java.lang.UnsupportedOperationException: empty collection - posted by xweb <as...@gmail.com> on 2015/04/28 04:39:38 UTC, 1 replies.
- 1.3.1: Persisting RDD in parquet - "Conflicting partition column names" - posted by sranga <sr...@gmail.com> on 2015/04/28 05:20:26 UTC, 2 replies.
- Spark 1.3.1 JavaStreamingContext - fileStream compile error - posted by lokeshkumar <lo...@dataken.net> on 2015/04/28 05:44:48 UTC, 2 replies.
- New JIRA - [SQL] Can't remove columns from DataFrame or save DataFrame from a join due to duplicate columns - posted by Don Drake <do...@gmail.com> on 2015/04/28 06:41:45 UTC, 1 replies.
- Spark Job fails with 6 executors and succeeds with 8 ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/28 07:19:49 UTC, 0 replies.
- Serialization error - posted by madhvi <ma...@orkash.com> on 2015/04/28 07:41:08 UTC, 4 replies.
- Re: Unable to work with foreachrdd - posted by drarse <dr...@gmail.com> on 2015/04/28 07:42:30 UTC, 0 replies.
- Re: Spark Streaming: JavaDStream compute method NPE - posted by Himanshu Mehra <hi...@gmail.com> on 2015/04/28 08:37:35 UTC, 0 replies.
- How to add jars to standalone pyspark program - posted by mj <jo...@gmail.com> on 2015/04/28 10:06:51 UTC, 4 replies.
- Spark Sql: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient - posted by LinQili <li...@outlook.com> on 2015/04/28 10:42:50 UTC, 0 replies.
- [Spark SQL] Problems creating a table in specified schema/database - posted by James Aley <ja...@swiftkey.com> on 2015/04/28 10:54:23 UTC, 1 replies.
- Re: A problem of using spark streaming to capture network packets - posted by Dean Wampler <de...@gmail.com> on 2015/04/28 14:07:25 UTC, 5 replies.
- Spark partitioning question - posted by Marius Danciu <ma...@gmail.com> on 2015/04/28 14:10:24 UTC, 3 replies.
- submitting to multiple masters - posted by James King <ja...@gmail.com> on 2015/04/28 14:13:36 UTC, 2 replies.
- Calculating the averages for each KEY in a Pairwise (K,V) RDD ... - posted by "subscriptions@prismalytics.io" <su...@prismalytics.io> on 2015/04/28 16:00:27 UTC, 5 replies.
- Single stream with series of transformations - posted by "jc.francisco" <jc...@gmail.com> on 2015/04/28 16:28:26 UTC, 0 replies.
- Best practices on testing Spark jobs - posted by Michal Michalski <mi...@boxever.com> on 2015/04/28 17:32:32 UTC, 4 replies.
- Spark streaming - textFileStream/fileStream - Get file name - posted by lokeshkumar <lo...@dataken.net> on 2015/04/28 18:13:09 UTC, 8 replies.
- How to run self-build spark on EC2? - posted by Bo Fu <bo...@uchicago.edu> on 2015/04/28 18:15:04 UTC, 1 replies.
- Re: How to deploy self-build spark source code on EC2 - posted by Nicholas Chammas <ni...@gmail.com> on 2015/04/28 18:19:35 UTC, 0 replies.
- solr in spark - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/04/28 18:27:46 UTC, 7 replies.
- PySpark: slicing issue with dataframes - posted by Ali Bajwa <al...@gmail.com> on 2015/04/28 18:53:20 UTC, 0 replies.
- Spark - Timeout Issues - OutOfMemoryError - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/28 19:05:34 UTC, 4 replies.
- Code For Loading Graph from Edge Tuple File - posted by geek2 <so...@gmail.com> on 2015/04/28 19:24:24 UTC, 0 replies.
- How to run customized Spark on EC2? - posted by Bo Fu <bo...@uchicago.edu> on 2015/04/28 19:29:59 UTC, 1 replies.
- default number of reducers - posted by Shushant Arora <sh...@gmail.com> on 2015/04/28 19:30:50 UTC, 1 replies.
- How to setup this "false streaming" problem - posted by Toni Cebrián <tc...@enerbyte.com> on 2015/04/28 19:38:14 UTC, 1 replies.
- MLLib SVMWithSGD is failing for large dataset - posted by sarath <sa...@gmail.com> on 2015/04/28 19:50:56 UTC, 2 replies.
- Initial tasks in job take time - posted by Anshul Singhle <an...@betaglide.com> on 2015/04/28 19:58:19 UTC, 2 replies.
- rdd.count with 100 elements taking 1 second to run - posted by Anshul Singhle <an...@betaglide.com> on 2015/04/28 22:17:40 UTC, 2 replies.
- Weird error/exception - posted by Vadim Bichutskiy <va...@gmail.com> on 2015/04/28 22:26:29 UTC, 2 replies.
- Re: Spark SQL 1.3.1 "saveAsParquetFile" will output tachyon file with different block size - posted by Calvin Jia <ji...@gmail.com> on 2015/04/28 23:23:42 UTC, 0 replies.
- Metric collection - posted by Giovanni Paolo Gibilisco <gi...@gmail.com> on 2015/04/29 00:14:54 UTC, 0 replies.
- Re: HBase HTable constructor hangs - posted by tridib <tr...@live.com> on 2015/04/29 04:12:03 UTC, 8 replies.
- Question about Memory Used and VCores Used - posted by "bit1129@163.com" <bi...@163.com> on 2015/04/29 04:12:55 UTC, 2 replies.
- External Application Run Status - posted by "Nastooh Avessta (navesta)" <na...@cisco.com> on 2015/04/29 06:08:50 UTC, 2 replies.
- How to stream all data out of a Kafka topic once, then terminate job? - posted by dgoldenberg <dg...@gmail.com> on 2015/04/29 06:34:35 UTC, 7 replies.
- Equal Height and Depth Binning in Spark - posted by kundan kumar <ii...@gmail.com> on 2015/04/29 07:22:29 UTC, 0 replies.
- How to set DEBUG level log of spark executor on Standalone deploy mode - posted by eric wong <wi...@gmail.com> on 2015/04/29 09:05:49 UTC, 0 replies.
- How Spark SQL supports primary and secondary indexes - posted by Nikolay Tikhonov <ti...@gmail.com> on 2015/04/29 12:53:33 UTC, 1 replies.
- Re: Driver memory leak? - posted by Sean Owen <so...@cloudera.com> on 2015/04/29 13:01:55 UTC, 4 replies.
- How to group multiple row data ? - posted by bipin <bi...@gmail.com> on 2015/04/29 13:12:46 UTC, 3 replies.
- java.io.IOException: No space left on device - posted by Selim Namsi <se...@gmail.com> on 2015/04/29 13:13:39 UTC, 5 replies.
- implicit function in SparkStreaming - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/04/29 13:20:38 UTC, 5 replies.
- Dataframe filter based on another Dataframe - posted by Olivier Girardot <ss...@gmail.com> on 2015/04/29 13:23:53 UTC, 2 replies.
- spark with standalone HBase - posted by Saurabh Gupta <sa...@semusi.com> on 2015/04/29 13:27:17 UTC, 5 replies.
- MLib KMeans on large dataset issues - posted by Sam Stoelinga <sa...@gmail.com> on 2015/04/29 13:53:30 UTC, 3 replies.
- DataFrame filter referencing error - posted by Francesco Bigarella <fr...@gmail.com> on 2015/04/29 13:56:58 UTC, 3 replies.
- Join between Streaming data vs Historical Data in spark - posted by Rendy Bambang Junior <re...@gmail.com> on 2015/04/29 16:11:40 UTC, 1 replies.
- How can I merge multiple DataFrame and remove duplicated key - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/04/29 18:12:45 UTC, 4 replies.
- Spark on Cassandra - posted by Matthew Johnson <ma...@algomi.com> on 2015/04/29 19:08:42 UTC, 2 replies.
- Performance advantage by loading data from local node over S3. - posted by Nisrina Luthfiyati <ni...@gmail.com> on 2015/04/29 19:20:05 UTC, 1 replies.
- Extra stage that executes before triggering computation with an action - posted by Tom Hubregtsen <th...@gmail.com> on 2015/04/29 19:47:30 UTC, 3 replies.
- Sort (order by) of the big dataset - posted by "Ulanov, Alexander" <al...@hp.com> on 2015/04/29 22:07:38 UTC, 1 replies.
- Compute pairwise distance - posted by "Driesprong, Fokko" <fo...@driesprong.frl> on 2015/04/29 22:11:20 UTC, 3 replies.
- Hardware provisioning for Spark SQl - posted by Pietro Gentile <pi...@gmail.com> on 2015/04/29 22:47:55 UTC, 0 replies.
- Too many open files when using Spark to consume messages from Kafka - posted by Bill Jay <bi...@gmail.com> on 2015/04/29 23:07:35 UTC, 10 replies.
- multiple programs compilation by sbt. - posted by Dan Dong <do...@gmail.com> on 2015/04/29 23:45:03 UTC, 2 replies.
- Kryo serialization of classes in additional jars - posted by Akshat Aranya <aa...@gmail.com> on 2015/04/30 01:42:52 UTC, 0 replies.
- How to install spark in spark on yarn mode - posted by xiaohe lan <zo...@gmail.com> on 2015/04/30 06:01:39 UTC, 3 replies.
- Event generator for SPARK-Streaming from csv - posted by anshu shukla <an...@gmail.com> on 2015/04/30 07:13:44 UTC, 0 replies.
- spark kryo serialization question - posted by 邓刚, , 技术中心, , tr...@vipshop.com on 2015/04/30 07:34:56 UTC, 0 replies.
- Enabling Event Log - posted by James King <ja...@gmail.com> on 2015/04/30 08:22:49 UTC, 0 replies.
- The Processing loading of Spark streaming on YARN is not in balance - posted by Kyle Lin <ky...@gmail.com> on 2015/04/30 08:38:54 UTC, 4 replies.
- Creating StructType with DataFrame.withColumn - posted by Justin Yip <yi...@prediction.io> on 2015/04/30 09:17:44 UTC, 0 replies.
- SparkContext.getCallSite is in the top of profiler by memory allocation - posted by Igor Petrov <ig...@gmail.com> on 2015/04/30 10:42:02 UTC, 0 replies.
- Best strategy for Pandas -> Spark - posted by Olivier Girardot <o....@lateral-thoughts.com> on 2015/04/30 10:42:23 UTC, 0 replies.
- Spark pre-built for Hadoop 2.6 - posted by Christophe Préaud <ch...@kelkoo.com> on 2015/04/30 12:00:51 UTC, 1 replies.
- RE: Is SQLContext thread-safe? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/04/30 12:50:10 UTC, 2 replies.
- Expert advise needed. (POC is at crossroads) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/04/30 15:18:12 UTC, 0 replies.
- is there anyway to enforce Spark to cache data in all worker nodes (almost equally) ? - posted by shahab <sh...@gmail.com> on 2015/04/30 15:42:27 UTC, 0 replies.
- [Running Spark Applications with Chronos or Marathon] - posted by Aram Mkrtchyan <ar...@gmail.com> on 2015/04/30 15:45:41 UTC, 0 replies.
- Error when saving as parquet to S3 from Spark - posted by Cosmin Cătălin Sanda <co...@gmail.com> on 2015/04/30 15:51:38 UTC, 0 replies.
- RE: is there anyway to enforce Spark to cache data in all worker nodes(almost equally) ? - posted by Alex <lx...@gmail.com> on 2015/04/30 16:21:52 UTC, 1 replies.
- real time Query engine Spark-SQL on Hbase - posted by Siddharth Ubale <si...@syncoms.com> on 2015/04/30 16:23:41 UTC, 0 replies.
- Re: real time Query engine Spark-SQL on Hbase - posted by Ted Yu <yu...@gmail.com> on 2015/04/30 16:54:49 UTC, 0 replies.
- Re: SparkStream saveAsTextFiles() - posted by anavidad <an...@gmail.com> on 2015/04/30 17:09:09 UTC, 0 replies.
- Error when saving as parquet to S3 - posted by cosmincatalin <co...@gmail.com> on 2015/04/30 17:38:14 UTC, 0 replies.
- JavaRDD> flatMap Lexicographical Permutations - Java Heap Error - posted by "Dan DeCapria, CivicScience" <da...@civicscience.com> on 2015/04/30 17:58:58 UTC, 2 replies.
- sap hana database load using jdbcRDD - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/04/30 22:51:11 UTC, 0 replies.