user@spark.apache.org, 2015-07

- Re: Retrieve hadoop conf object from Python API - posted by ayan guha <gu...@gmail.com> on 2015/07/01 01:23:06 UTC, 2 replies.
- Re: run reduceByKey on huge data in spark - posted by "barge.nilesh" <ba...@gmail.com> on 2015/07/01 02:26:57 UTC, 0 replies.
- 1.4.0 - posted by Exie <tf...@prodevelop.com.au> on 2015/07/01 03:20:14 UTC, 2 replies.
- output folder structure not getting commited and remains as _temporary - posted by nkd <ka...@gmail.com> on 2015/07/01 03:28:00 UTC, 1 replies.
- Spark 1.4.0: Parquet partitions / folder hierarchy changed from 1.3.1 - posted by Exie <tf...@prodevelop.com.au> on 2015/07/01 03:33:16 UTC, 0 replies.
- Re: Re: got "java.lang.reflect.UndeclaredThrowableException" when running multiply APPs in spark - posted by lu...@sina.com on 2015/07/01 03:58:22 UTC, 0 replies.
- Re: Subsecond queries possible? - posted by Eric Pederson <er...@gmail.com> on 2015/07/01 04:29:02 UTC, 2 replies.
- Re: Spark 1.4.0: read.df() causes excessive IO - posted by Exie <tf...@prodevelop.com.au> on 2015/07/01 05:27:01 UTC, 0 replies.
- Re: s3 bucket access/read file - posted by Exie <tf...@prodevelop.com.au> on 2015/07/01 05:31:27 UTC, 3 replies.
- Re: Spark run errors on Raspberry Pi - posted by Exie <tf...@prodevelop.com.au> on 2015/07/01 05:40:51 UTC, 1 replies.
- Issue with parquet write after join (Spark 1.4.0) - posted by Pooja Jain <po...@gmail.com> on 2015/07/01 06:50:46 UTC, 4 replies.
- Re: Spark Dataframe 1.4 (GroupBy partial match) - posted by Suraj Shetiya <su...@gmail.com> on 2015/07/01 07:30:49 UTC, 4 replies.
- Re: Difference between spark-defaults.conf and SparkConf.set - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/07/01 08:27:44 UTC, 1 replies.
- Re: How to stop making Multiple copies in memory when running multiple Spark jobs? - posted by Himanshu Mehra <hi...@gmail.com> on 2015/07/01 08:28:02 UTC, 1 replies.
- Run multiple Spark jobs concurrently - posted by Nirmal Fernando <ni...@wso2.com> on 2015/07/01 08:31:22 UTC, 2 replies.
- Re: Issues in reading a CSV file from local file system using spark-shell - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/07/01 08:44:26 UTC, 0 replies.
- question about resource allocation on the spark standalone cluster - posted by Tomer Benyamini <to...@gmail.com> on 2015/07/01 08:45:07 UTC, 0 replies.
- upload to s3, UI Total Duration and Sum of Job Durations - posted by "igor.berman" <ig...@gmail.com> on 2015/07/01 09:14:52 UTC, 0 replies.
- coalesce on dataFrame - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/01 09:22:30 UTC, 3 replies.
- Can I do Joins across Event Streams ? - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/07/01 09:36:28 UTC, 1 replies.
- Re: Can Dependencies Be Resolved on Spark Cluster? - posted by SLiZn Liu <sl...@gmail.com> on 2015/07/01 09:37:30 UTC, 0 replies.
- Spark program running infinitely - posted by Ladle <la...@tcs.com> on 2015/07/01 09:55:09 UTC, 0 replies.
- RE: Spark streaming on standalone cluster - posted by pr...@wipro.com on 2015/07/01 10:30:37 UTC, 2 replies.
- Re: Check for null in PySpark DataFrame - posted by Olivier Girardot <ss...@gmail.com> on 2015/07/01 12:07:18 UTC, 2 replies.
- Issues when saving dataframe in Spark 1.4 with parquet format - posted by David Sabater Dinter <da...@gmail.com> on 2015/07/01 12:57:05 UTC, 0 replies.
- Can a Spark Driver Program be a REST Service by itself? - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/07/01 13:02:25 UTC, 2 replies.
- Json Dataframe formation and Querying - posted by "Chaudhary, Umesh" <Um...@searshc.com> on 2015/07/01 13:45:48 UTC, 0 replies.
- Illegal access error when initializing SparkConf - posted by Ramprakash Ramamoorthy <yo...@gmail.com> on 2015/07/01 15:37:19 UTC, 1 replies.
- Passing name of package in sparkR.init() - posted by Sourav Mazumder <so...@gmail.com> on 2015/07/01 15:38:26 UTC, 0 replies.
- Making Unpersist Lazy - posted by Jem Tucker <je...@gmail.com> on 2015/07/01 16:18:42 UTC, 3 replies.
- custom RDD in java - posted by Shushant Arora <sh...@gmail.com> on 2015/07/01 16:19:59 UTC, 6 replies.
- StorageLevel.MEMORY_AND_DISK_SER - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/01 17:01:31 UTC, 7 replies.
- BroadcastHashJoin when RDD is not cached - posted by Srikanth <sr...@gmail.com> on 2015/07/01 17:30:20 UTC, 2 replies.
- binaryFiles() for 1 million files, too much memory required - posted by Konstantinos Kougios <ko...@googlemail.com> on 2015/07/01 18:06:25 UTC, 1 replies.
- import errors with Eclipse Scala - posted by Stefan Panayotov <sp...@msn.com> on 2015/07/01 18:57:09 UTC, 4 replies.
- Convert CSV lines to List of Objects - posted by Ashish Soni <as...@gmail.com> on 2015/07/01 19:00:14 UTC, 1 replies.
- Re: sparkR could not find function "textFile" - posted by Sourav Mazumder <so...@gmail.com> on 2015/07/01 19:03:33 UTC, 2 replies.
- Re: breeze.linalg.DenseMatrix not found - posted by Alex Gittens <sw...@gmail.com> on 2015/07/01 19:07:06 UTC, 0 replies.
- Custom order by in Spark SQL - posted by Mick Davies <Mi...@gmail.com> on 2015/07/01 19:25:09 UTC, 1 replies.
- spark.streaming.receiver.maxRate Not taking effect - posted by Laeeq Ahmed <la...@yahoo.com.INVALID> on 2015/07/01 19:27:14 UTC, 1 replies.
- Re: How to recover in case user errors in streaming - posted by Amit Assudani <aa...@impetus.com> on 2015/07/01 19:35:45 UTC, 2 replies.
- Re: Need clarification on spark on cluster set up instruction - posted by Alex Gittens <sw...@gmail.com> on 2015/07/01 19:38:47 UTC, 0 replies.
- Re: Spark driver using Spark Streaming shows increasing memory/CPU usage - posted by Neil Mayo <Ne...@velocityww.com> on 2015/07/01 19:51:08 UTC, 1 replies.
- making dataframe for different types using spark-csv - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/01 20:03:12 UTC, 5 replies.
- Task InputSize source code location - posted by Shiyao Ma <i...@introo.me> on 2015/07/01 20:04:02 UTC, 0 replies.
- BroadCast Multiple DataFrame ( JDBC Tables ) - posted by Ashish Soni <as...@gmail.com> on 2015/07/01 20:14:35 UTC, 1 replies.
- Spark Standalone Cluster - Slave not connecting to Master - posted by rshyam <hs...@gmail.com> on 2015/07/01 20:35:53 UTC, 2 replies.
- Use of Apache Spark with R package SNOW, or perhaps Hadoop YARN with same SNOW? - posted by "Galkowski, Jan" <jg...@akamai.com> on 2015/07/01 21:57:56 UTC, 0 replies.
- Re: How to disable parquet schema merging in 1.4? - posted by Cheng Lian <li...@gmail.com> on 2015/07/01 22:13:28 UTC, 0 replies.
- Calling MLLib from SparkR - posted by Sourav Mazumder <so...@gmail.com> on 2015/07/02 01:23:30 UTC, 2 replies.
- DataFrame Filter Inside Another Data Frame Map - posted by Ashish Soni <as...@gmail.com> on 2015/07/02 02:43:17 UTC, 5 replies.
- KMeans questions - posted by Eric Friedman <er...@gmail.com> on 2015/07/02 02:53:00 UTC, 1 replies.
- DataFrame Find/Filter Based on Input - Inside Map function - posted by Ashish Soni <as...@gmail.com> on 2015/07/02 04:15:21 UTC, 3 replies.
- Spark on Hadoop 2.5.2 - posted by Xiaoyu Ma <hz...@corp.netease.com> on 2015/07/02 05:22:21 UTC, 0 replies.
- Error with splitting contents of a dataframe column using Spark 1.4 for nested complex json file - posted by Mike Tracy <mi...@gmail.com> on 2015/07/02 07:18:10 UTC, 0 replies.
- Meets class not found error in spark console with newly hive context - posted by Terry Hole <hu...@gmail.com> on 2015/07/02 07:20:50 UTC, 2 replies.
- getting WARN ReliableDeliverySupervisor - posted by xiaohe lan <zo...@gmail.com> on 2015/07/02 07:30:28 UTC, 1 replies.
- Re: Performance tuning in Spark SQL. - posted by prosp4300 <pr...@163.com> on 2015/07/02 08:32:04 UTC, 0 replies.
- .NET on Apache Spark? - posted by Zwits <Da...@ortec-finance.com> on 2015/07/02 10:33:18 UTC, 7 replies.
- Re: Spark driver hangs on start of job - posted by Sjoerd Mulder <sj...@gmail.com> on 2015/07/02 10:37:00 UTC, 1 replies.
- "insert overwrite table phonesall" in spark-sql resulted in java.io.StreamCorruptedException - posted by John Jay <zj...@gmail.com> on 2015/07/02 11:22:39 UTC, 0 replies.
- All master are unreponsive issue - posted by lu...@sina.com on 2015/07/02 11:31:06 UTC, 2 replies.
- EventLoggingListener threw an exception when sparkContext.stop - posted by Ayoub <be...@gmail.com> on 2015/07/02 11:48:33 UTC, 0 replies.
- Spark SQL and Streaming - How to execute JDBC Query only once - posted by Ashish Soni <as...@gmail.com> on 2015/07/02 14:40:48 UTC, 1 replies.
- Starting Spark without automatically starting HiveContext - posted by Daniel Haviv <da...@veracity-group.com> on 2015/07/02 14:41:18 UTC, 4 replies.
- override/update options in Dataframe/JdbcRdd - posted by manohar <ma...@gmail.com> on 2015/07/02 15:10:09 UTC, 0 replies.
- Array fields in dataframe.write.jdbc - posted by Anand Nalya <an...@gmail.com> on 2015/07/02 15:49:08 UTC, 0 replies.
- thrift-server does not load jars files (Azure HDInsight) - posted by Daniel Haviv <da...@veracity-group.com> on 2015/07/02 16:38:18 UTC, 5 replies.
- Dataframe in single partition after sorting? - posted by Cesar Flores <ce...@gmail.com> on 2015/07/02 16:44:46 UTC, 0 replies.
- wholeTextFiles("/x/*/*.txt") runs single threaded - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/02 18:05:31 UTC, 1 replies.
- Fwd: map vs foreach for sending data to external system - posted by Alexandre Rodrigues <al...@gmail.com> on 2015/07/02 18:32:18 UTC, 5 replies.
- sliding - posted by tog <gu...@gmail.com> on 2015/07/02 18:37:27 UTC, 6 replies.
- Re: Grouping runs of elements in a RDD - posted by Mohit Jaggi <mo...@gmail.com> on 2015/07/02 19:27:43 UTC, 1 replies.
- Setting JVM heap start and max sizes, -Xms and -Xmx, for executors - posted by Mulugeta Mammo <mu...@gmail.com> on 2015/07/02 21:05:54 UTC, 6 replies.
- 1.4.0 regression: out-of-memory errors on small data - posted by sim <si...@swoop.com> on 2015/07/02 21:40:47 UTC, 16 replies.
- Re: is there any significant performance issue converting between rdd and dataframes in pyspark? - posted by Davies Liu <da...@databricks.com> on 2015/07/02 22:21:30 UTC, 0 replies.
- where is the source code for org.apache.spark.launcher.Main? - posted by Shiyao Ma <i...@introo.me> on 2015/07/02 23:58:53 UTC, 1 replies.
- configuring max sum of cores and memory in cluster through command line - posted by Alexander Waldin <aw...@inflection.com> on 2015/07/03 00:20:31 UTC, 1 replies.
- Re: Spark launching without all of the requested YARN resources - posted by Arun Luthra <ar...@gmail.com> on 2015/07/03 00:48:06 UTC, 0 replies.
- duplicate names in sql allowed? - posted by Koert Kuipers <ko...@tresata.com> on 2015/07/03 00:59:56 UTC, 3 replies.
- import pyspark.sql.Row gives error in 1.4.1 - posted by Krishna Sankar <ks...@gmail.com> on 2015/07/03 03:33:15 UTC, 0 replies.
- Aggregating the same column multiple times - posted by sim <si...@swoop.com> on 2015/07/03 04:36:00 UTC, 0 replies.
- Re: All master are unreponsive issue - posted by lu...@sina.com on 2015/07/03 05:35:02 UTC, 0 replies.
- Spark Thriftserver exec insert sql got error on Hadoop federation - posted by Xiaoyu Wang <wa...@jd.com> on 2015/07/03 05:37:13 UTC, 0 replies.
- Re: reduceByKeyAndWindow, but using log timestamps instead of clock seconds - posted by Tobias Pfeiffer <tg...@preferred.jp> on 2015/07/03 06:05:29 UTC, 0 replies.
- Spark MLLib 140 - logistic regression with SGD model accuracy is different in local mode and cluster mode - posted by Nirmal Fernando <ni...@wso2.com> on 2015/07/03 06:10:31 UTC, 0 replies.
- Re: Solving Systems of Linear Equations Using Spark? - posted by jamaica <my...@gmail.com> on 2015/07/03 08:17:49 UTC, 0 replies.
- Kryo fails to serialise output - posted by Dominik Hübner <co...@dhuebner.com> on 2015/07/03 08:44:55 UTC, 1 replies.
- Streaming: updating broadcast variables - posted by James Cole <ja...@binarism.net> on 2015/07/03 08:47:18 UTC, 2 replies.
- Optimizations - posted by Marius Danciu <ma...@gmail.com> on 2015/07/03 09:13:20 UTC, 3 replies.
- [spark1.4] sparkContext.stop causes exception on Mesos - posted by Ayoub <be...@gmail.com> on 2015/07/03 09:32:31 UTC, 0 replies.
- Filter on Grouped Data - posted by Megha Sridhar- Cynepia <me...@cynepia.com> on 2015/07/03 09:59:08 UTC, 1 replies.
- Spark 1.4 MLLib Bug?: Multiclass Classification "requirement failed: sizeInBytes was negative" - posted by Danny Linden <ko...@dannylinden.de> on 2015/07/03 10:54:53 UTC, 2 replies.
- build spark 1.4 source code for sparkR with maven - posted by "1106944911@qq.com" <11...@qq.com> on 2015/07/03 10:57:30 UTC, 0 replies.
- Accessing the console from spark - posted by Jem Tucker <je...@gmail.com> on 2015/07/03 11:02:10 UTC, 6 replies.
- Re: build spark 1.4 source code for sparkR with maven - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/07/03 11:14:16 UTC, 1 replies.
- Multiple Join Conditions in dataframe join - posted by bipin <bi...@gmail.com> on 2015/07/03 11:45:39 UTC, 0 replies.
- Spark Streaming broadcast to all keys - posted by micvog <mi...@micvog.com> on 2015/07/03 13:34:10 UTC, 1 replies.
- Spark performance issue - posted by diplomatic Guru <di...@gmail.com> on 2015/07/03 14:58:47 UTC, 1 replies.
- ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/03 16:16:00 UTC, 0 replies.
- Float type coercion on SparkR with hiveContext - posted by Evgeny Sinelnikov <es...@griddynamics.com> on 2015/07/03 16:17:40 UTC, 0 replies.
- Spark-csv into labeled points with null values - posted by Sa...@wellsfargo.com on 2015/07/03 17:04:06 UTC, 0 replies.
- SparkSQL cache table with multiple replicas - posted by David Sabater Dinter <da...@gmail.com> on 2015/07/03 17:33:00 UTC, 0 replies.
- Re: Spark SQL groupby timestamp - posted by sim <si...@swoop.com> on 2015/07/03 19:06:30 UTC, 0 replies.
- Experience with centralised logging for Spark? - posted by Edward Sargisson <ej...@gmail.com> on 2015/07/03 20:34:36 UTC, 0 replies.
- Are Spark Streaming RDDs always processed in order? - posted by khaledh <kh...@gmail.com> on 2015/07/04 04:12:44 UTC, 4 replies.
- SparkR and Spark Mlib - posted by praveen S <my...@gmail.com> on 2015/07/04 04:22:58 UTC, 1 replies.
- Re: How to timeout a task? - posted by William Ferrell <wf...@gmail.com> on 2015/07/04 05:08:45 UTC, 0 replies.
- Contineous errors trying to start spark-shell - posted by Mohamed Lrhazi <Mo...@georgetown.edu> on 2015/07/04 06:47:28 UTC, 0 replies.
- Feature Generation On Spark - posted by rishikesh <ri...@hotmail.com> on 2015/07/04 13:04:00 UTC, 6 replies.
- Authorisation issue in Spark while using SQL based Authorization - posted by PKUKILLA <pk...@gmail.com> on 2015/07/04 17:52:48 UTC, 0 replies.
- Get Spark version before starting context - posted by Patrick Woody <pa...@gmail.com> on 2015/07/04 22:00:14 UTC, 1 replies.
- calling HiveContext.table or running a query reads files unnecessarily in S3 - posted by Steve Lindemann <sr...@gmail.com> on 2015/07/05 00:45:39 UTC, 0 replies.
- Spark got stuck with BlockManager after computing connected components using GraphX - posted by Hellen <ho...@gmail.com> on 2015/07/05 00:57:44 UTC, 2 replies.
- JDBC Streams - posted by ayan guha <gu...@gmail.com> on 2015/07/05 01:58:51 UTC, 5 replies.
- text file stream to HDFS - posted by ravi tella <dd...@gmail.com> on 2015/07/05 02:23:34 UTC, 1 replies.
- Splitting dataframe using Spark 1.4 for nested json input - posted by Mike Tracy <mi...@gmail.com> on 2015/07/05 03:35:16 UTC, 0 replies.
- Restarting Spark Streaming Application with new code - posted by Vinoth Chandar <vi...@uber.com> on 2015/07/05 04:01:59 UTC, 2 replies.
- mvn build hangs on: Dependency-reduced POM written at bagel/dependency-reduced-pom.xml - posted by Alec Taylor <al...@gmail.com> on 2015/07/05 06:44:46 UTC, 2 replies.
- Why Kryo Serializer is slower than Java Serializer in TeraSort - posted by Gavin Liu <il...@gmail.com> on 2015/07/05 08:31:06 UTC, 3 replies.
- Futures timed out after 10000 milliseconds - posted by SamRoberts <sa...@yahoo.com> on 2015/07/05 08:40:27 UTC, 6 replies.
- Spark custom streaming receiver not storing data reliably? - posted by Ajit Bhingarkar <aj...@capiot.com> on 2015/07/05 17:42:13 UTC, 3 replies.
- Benchmark results between Flink and Spark - posted by Slim Baltagi <sb...@gmail.com> on 2015/07/05 19:24:10 UTC, 6 replies.
- cores and resource management - posted by nizang <ni...@windward.eu> on 2015/07/05 21:52:46 UTC, 1 replies.
- efficiently accessing partition data for datasets in S3 with SparkSQL - posted by Steve Lindemann <sr...@gmail.com> on 2015/07/05 23:47:26 UTC, 0 replies.
- Re: Spark-ImageAnalysis - posted by Slim Baltagi <sb...@gmail.com> on 2015/07/06 02:21:39 UTC, 0 replies.
- [SparkScore]Performance portal for Apache Spark - WW27 - posted by "Huang, Jie" <ji...@intel.com> on 2015/07/06 03:01:07 UTC, 2 replies.
- Can we allow executor to exit when tasks fail too many time? - posted by Tao Li <li...@gmail.com> on 2015/07/06 06:25:20 UTC, 2 replies.
- java.io.IOException: No space left on device--regd. - posted by Devarajan Srinivasan <de...@gmail.com> on 2015/07/06 07:14:01 UTC, 2 replies.
- lower and upper offset not working in spark with mysql database - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/06 07:26:19 UTC, 2 replies.
- Aggregate to array (or 'slice by key') with DataFrames - posted by Alex Beatson <al...@gmail.com> on 2015/07/06 07:41:21 UTC, 0 replies.
- DESCRIBE FORMATTED doesn't work in Hive Thrift Server? - posted by Rex Xiong <by...@gmail.com> on 2015/07/06 07:53:20 UTC, 1 replies.
- Unable to start spark-sql - posted by sandeep vura <sa...@gmail.com> on 2015/07/06 08:12:33 UTC, 4 replies.
- Re: How to use caching in Spark Actions or Output operations? - posted by Himanshu Mehra <hi...@gmail.com> on 2015/07/06 08:53:01 UTC, 0 replies.
- How does Spark streaming move data around ? - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/07/06 09:05:36 UTC, 0 replies.
- Spark SQL queries hive table, real time ? - posted by spierki <fl...@crisalid.com> on 2015/07/06 09:23:41 UTC, 3 replies.
- Split RDD into two in a single pass - posted by Anand Nalya <an...@gmail.com> on 2015/07/06 09:32:57 UTC, 1 replies.
- com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException in spark with mysql database - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/06 09:38:06 UTC, 1 replies.
- Re: java.lang.IllegalArgumentException: A metric named ... already exists - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/07/06 10:02:06 UTC, 1 replies.
- How to shut down spark web UI? - posted by lu...@sina.com on 2015/07/06 11:05:40 UTC, 1 replies.
- Spark-CSV: Multiple delimiters and Null fields support - posted by Anas Sherwani <an...@gmail.com> on 2015/07/06 11:15:04 UTC, 0 replies.
- [SPARK-SQL] Re-use col alias in the select clause to avoid sub query - posted by Hao Ren <in...@gmail.com> on 2015/07/06 12:02:01 UTC, 0 replies.
- Application jar file not found exception when submitting application - posted by "bit1129@163.com" <bi...@163.com> on 2015/07/06 12:14:44 UTC, 2 replies.
- [SparkR] Float type coercion with hiveContext - posted by Evgeny Sinelnikov <es...@griddynamics.com> on 2015/07/06 12:31:15 UTC, 3 replies.
- Spark's equivalent for Analytical functions in Oracle - posted by Gireesh Puthumana <gi...@augmentiq.in> on 2015/07/06 12:50:53 UTC, 1 replies.
- Spark equivalent for Oracle's analytical functions - posted by gireeshp <gi...@augmentiq.in> on 2015/07/06 13:06:29 UTC, 1 replies.
- writing to kafka using spark streaming - posted by Shushant Arora <sh...@gmail.com> on 2015/07/06 13:11:23 UTC, 5 replies.
- kafka offset commit in spark streaming 1.2 - posted by Shushant Arora <sh...@gmail.com> on 2015/07/06 14:11:20 UTC, 5 replies.
- How Will Spark Execute below Code - Driver and Executors - posted by Ashish Soni <as...@gmail.com> on 2015/07/06 14:57:53 UTC, 1 replies.
- Converting spark JDBCRDD to DataFrame - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/06 15:42:41 UTC, 1 replies.
- How to call hiveContext.sql() on all the Hive partitions in parallel? - posted by kachau <um...@gmail.com> on 2015/07/06 19:21:55 UTC, 0 replies.
- How do we control output part files created by Spark job? - posted by kachau <um...@gmail.com> on 2015/07/06 19:23:20 UTC, 11 replies.
- How to create a LabeledPoint RDD from a Data Frame - posted by Sourav Mazumder <so...@gmail.com> on 2015/07/06 19:28:47 UTC, 1 replies.
- Cluster sizing for recommendations - posted by Danny Yates <da...@codeaholics.org> on 2015/07/06 20:58:23 UTC, 1 replies.
- Master doesn't start, no logs - posted by maxdml <ma...@gmail.com> on 2015/07/06 21:24:06 UTC, 4 replies.
- Run sparkR non-interactively - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/07/06 21:24:57 UTC, 1 replies.
- User Defined Functions - Execution on Clusters - posted by "Eskilson,Aleksander" <Al...@Cerner.com> on 2015/07/06 21:55:43 UTC, 2 replies.
- Spark standalone cluster - Output file stored in temporary directory in worker - posted by MorEru <hs...@gmail.com> on 2015/07/06 22:47:31 UTC, 3 replies.
- Spark application with a RESTful API - posted by Sagi r <st...@gmail.com> on 2015/07/06 22:58:28 UTC, 3 replies.
- How to create empty RDD - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/06 23:11:15 UTC, 3 replies.
- How does executor cores change the spark job behavior ? - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/07 00:36:30 UTC, 0 replies.
- Job consistently failing after leftOuterJoin() - oddly sized / non-uniform partitions - posted by Mohammed Omer <be...@gmail.com> on 2015/07/07 00:57:40 UTC, 2 replies.
- Random Forest in MLLib - posted by Sourav Mazumder <so...@gmail.com> on 2015/07/07 01:46:09 UTC, 1 replies.
- How to debug java.io.OptionalDataException issues - posted by Yana Kadiyska <ya...@gmail.com> on 2015/07/07 02:35:08 UTC, 1 replies.
- Spark Unit tests - RDDBlockId not found - posted by Malte <ma...@gmail.com> on 2015/07/07 02:58:30 UTC, 0 replies.
- JVM is not ready after 10 seconds. - posted by Ashish Dutt <as...@gmail.com> on 2015/07/07 03:07:39 UTC, 0 replies.
- JVM is not ready after 10 seconds - posted by ashishdutt <as...@gmail.com> on 2015/07/07 03:11:51 UTC, 3 replies.
- Please add the Chicago Spark Users' Group to the community page - posted by Dean Wampler <de...@gmail.com> on 2015/07/07 04:06:43 UTC, 1 replies.
- Re: how to black list nodes on the cluster - posted by Gylfi <gy...@berkeley.edu> on 2015/07/07 04:11:16 UTC, 1 replies.
- The auxService:spark_shuffle does not exist - posted by roy <rp...@njit.edu> on 2015/07/07 04:24:26 UTC, 7 replies.
- Re: Re: How to shut down spark web UI? - posted by lu...@sina.com on 2015/07/07 04:25:36 UTC, 0 replies.
- Hibench build fail - posted by lu...@sina.com on 2015/07/07 08:50:13 UTC, 2 replies.
- SparkSQL OOM issue - posted by sh...@tsmc.com on 2015/07/07 09:58:12 UTC, 2 replies.
- Maintain Persistent Connection with Hive meta store - posted by wazza <ra...@gmail.com> on 2015/07/07 11:07:06 UTC, 1 replies.
- HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient - posted by bdev <bu...@gmail.com> on 2015/07/07 11:07:28 UTC, 1 replies.
- RECEIVED SIGNAL 15: SIGTERM - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/07 12:16:07 UTC, 6 replies.
- How to implement top() and filter() on object List for JavaRDD - posted by Hafsa Asif <ha...@matchinguu.com> on 2015/07/07 12:22:17 UTC, 8 replies.
- [SPARK-SQL] libgplcompression.so already loaded in another classloader - posted by Sea <26...@qq.com> on 2015/07/07 12:29:59 UTC, 1 replies.
- How to solve ThreadException in Apache Spark standalone Java Application - posted by Hafsa Asif <ha...@matchinguu.com> on 2015/07/07 12:38:07 UTC, 5 replies.
- (Unknown) - posted by Anand Nalya <an...@gmail.com> on 2015/07/07 13:34:02 UTC, 2 replies.
- RE: - posted by Evo Eftimov <ev...@isecc.com> on 2015/07/07 13:41:06 UTC, 6 replies.
- sparkr-submit additional R files - posted by Michał Zieliński <zi...@gmail.com> on 2015/07/07 14:13:29 UTC, 1 replies.
- Please add the Cincinnati spark meetup to the list of meet ups - posted by Darin McBeath <dd...@yahoo.com.INVALID> on 2015/07/07 14:35:32 UTC, 0 replies.
- Question about master memory requirement and GraphX pagerank performance ! - posted by Khaled Ammar <kh...@gmail.com> on 2015/07/07 14:40:38 UTC, 0 replies.
- is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors? - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/07 15:14:01 UTC, 4 replies.
- How to write mapreduce programming in spark by using java on user-defined javaPairRDD? - posted by 付雅丹 <ya...@gmail.com> on 2015/07/07 16:18:56 UTC, 1 replies.
- Spark Kafka Direct Streaming - posted by abi_pat <pr...@gmail.com> on 2015/07/07 16:42:44 UTC, 1 replies.
- Remoting started followed by a Remoting shut down straight away - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/07 16:42:47 UTC, 0 replies.
- Re: Error when connecting to Spark SQL via Hive JDBC driver - posted by ratio <gm...@gmx.de> on 2015/07/07 17:02:28 UTC, 1 replies.
- Best practice for using singletons on workers (seems unanswered) ? - posted by dgoldenberg <dg...@gmail.com> on 2015/07/07 17:04:33 UTC, 5 replies.
- spark-submit can not resolve spark-hive_2.10 - posted by Hao Ren <in...@gmail.com> on 2015/07/07 17:06:27 UTC, 2 replies.
- How to deal with null values on LabeledPoint - posted by Sa...@wellsfargo.com on 2015/07/07 17:35:26 UTC, 0 replies.
- Regarding master node failure - posted by swetha <sw...@gmail.com> on 2015/07/07 18:51:03 UTC, 1 replies.
- Re: java.lang.OutOfMemoryError: PermGen space - posted by jitender <ji...@sparklinedata.com> on 2015/07/07 18:53:56 UTC, 0 replies.
- Windows - endless "Dependency-reduced POM written..." in Bagel build - posted by Lincoln Atkinson <la...@microsoft.com> on 2015/07/07 19:04:46 UTC, 3 replies.
- Is it now possible to incrementally update a graph in GraphX - posted by Hellen <ho...@gmail.com> on 2015/07/07 19:24:01 UTC, 0 replies.
- DataFrame question - posted by Naveen Madhire <vm...@umail.iu.edu> on 2015/07/07 19:29:02 UTC, 1 replies.
- How to change hive database? - posted by Arun Luthra <ar...@gmail.com> on 2015/07/07 20:07:56 UTC, 2 replies.
- (de)serialize DStream - posted by Chen Song <ch...@gmail.com> on 2015/07/07 21:43:57 UTC, 1 replies.
- What else is need to setup native support of BLAS/LAPACK with Spark? - posted by Arun Ahuja <aa...@gmail.com> on 2015/07/07 21:47:42 UTC, 10 replies.
- unable to bring up cluster with ec2 script - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/07/07 22:34:24 UTC, 2 replies.
- Does spark supports the Hive function posexplode function? - posted by Jeff J Li <li...@us.ibm.com> on 2015/07/07 23:10:11 UTC, 3 replies.
- Hive UDFs - posted by chrish2312 <ch...@palantir.com> on 2015/07/08 00:19:53 UTC, 1 replies.
- spark - redshift !!! - posted by spark user <sp...@yahoo.com.INVALID> on 2015/07/08 00:57:34 UTC, 4 replies.
- Parallelizing multiple RDD / DataFrame creation in Spark - posted by Brandon White <bw...@gmail.com> on 2015/07/08 02:04:41 UTC, 7 replies.
- Why can I not insert into TempTables in Spark SQL? - posted by Brandon White <bw...@gmail.com> on 2015/07/08 02:10:22 UTC, 0 replies.
- How to submit streaming application and exit - posted by Bin Wang <wb...@gmail.com> on 2015/07/08 05:13:32 UTC, 2 replies.
- how to use DoubleRDDFunctions on mllib Vector? - posted by 诺铁 <no...@gmail.com> on 2015/07/08 05:26:27 UTC, 3 replies.
- How to verify that the worker is connected to master in CDH5.4 - posted by Ashish Dutt <as...@gmail.com> on 2015/07/08 05:42:50 UTC, 6 replies.
- Catalyst Errors when building spark from trunk - posted by Stephen Boesch <ja...@gmail.com> on 2015/07/08 06:46:18 UTC, 0 replies.
- Re: HiveContext throws org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient - posted by prosp4300 <pr...@163.com> on 2015/07/08 07:40:08 UTC, 0 replies.
- PySpark MLlib: py4j cannot find trainImplicitALSModel method - posted by sooraj <so...@gmail.com> on 2015/07/08 09:05:29 UTC, 5 replies.
- Re: RE: Hibench build fail - posted by lu...@sina.com on 2015/07/08 09:38:31 UTC, 0 replies.
- Word2Vec distributed? - posted by Carsten Schnober <sc...@ukp.informatik.tu-darmstadt.de> on 2015/07/08 09:44:41 UTC, 1 replies.
- SnappyCompressionCodec on the master - posted by nizang <ni...@windward.eu> on 2015/07/08 09:47:20 UTC, 1 replies.
- How to upgrade Spark version in CDH 5.4 - posted by Ashish Dutt <as...@gmail.com> on 2015/07/08 10:03:45 UTC, 3 replies.
- Re: Re: RE: Hibench build fail - posted by lu...@sina.com on 2015/07/08 10:14:05 UTC, 0 replies.
- UDF in spark - posted by vinod kumar <vi...@gmail.com> on 2015/07/08 10:53:15 UTC, 4 replies.
- Day of year - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/08 11:04:28 UTC, 0 replies.
- Using different users with spark thriftserver - posted by "Zalzberg, Idan (Agoda)" <Id...@agoda.com> on 2015/07/08 11:43:26 UTC, 0 replies.
- Is there a way to shutdown the derby in hive context in spark shell? - posted by Terry Hole <hu...@gmail.com> on 2015/07/08 13:01:28 UTC, 3 replies.
- Problem in Understanding concept of Physical Cores - posted by Aniruddh Sharma <as...@gmail.com> on 2015/07/08 13:21:21 UTC, 7 replies.
- Out of Memory Errors on less number of cores in proportion to Partitions in Data - posted by Aniruddh Sharma <as...@gmail.com> on 2015/07/08 13:21:48 UTC, 4 replies.
- Re: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down - posted by maxdml <ma...@gmail.com> on 2015/07/08 13:56:04 UTC, 0 replies.
- Announcement of the webinar in the newsletter and on the site - posted by Oleh Rozvadovskyy <or...@cybervisiontech.com> on 2015/07/08 14:25:46 UTC, 0 replies.
- Getting started with spark-scala developemnt in eclipse. - posted by "Prateek ." <pr...@aricent.com> on 2015/07/08 15:38:46 UTC, 3 replies.
- foreachRDD vs. forearchPartition ? - posted by dgoldenberg <dg...@gmail.com> on 2015/07/08 15:42:58 UTC, 9 replies.
- Connecting to nodes on cluster - posted by Ashish Dutt <as...@gmail.com> on 2015/07/08 16:01:01 UTC, 4 replies.
- Kryo Serializer on Worker doesn't work by default. - posted by Eugene Morozov <fa...@list.ru> on 2015/07/08 16:40:05 UTC, 1 replies.
- PySpark without PySpark - posted by Julian <Ju...@Magnetic.com> on 2015/07/08 16:46:50 UTC, 12 replies.
- Jobs with unknown origin. - posted by Jan-Paul Bultmann <ja...@me.com> on 2015/07/08 17:22:23 UTC, 0 replies.
- Streaming checkpoints and logic change - posted by Jong Wook Kim <jo...@nyu.edu> on 2015/07/08 19:02:45 UTC, 3 replies.
- spark core/streaming doubts - posted by Shushant Arora <sh...@gmail.com> on 2015/07/08 19:26:31 UTC, 1 replies.
- Re: Reading Avro files from Streaming - posted by harris <r....@yahoo.com> on 2015/07/08 19:58:57 UTC, 0 replies.
- Create RDD from output of unix command - posted by foobar <he...@fb.com> on 2015/07/08 20:02:36 UTC, 4 replies.
- spark benchmarking - posted by "MrAsanjar ." <af...@gmail.com> on 2015/07/08 20:23:16 UTC, 2 replies.
- Communication between driver, cluster and HiveServer - posted by Eric Pederson <er...@gmail.com> on 2015/07/08 20:27:08 UTC, 1 replies.
- pause and resume streaming app - posted by Shushant Arora <sh...@gmail.com> on 2015/07/08 20:52:36 UTC, 1 replies.
- Job completed successfully without processing anything - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/08 21:32:23 UTC, 1 replies.
- RDD saveAsTextFile() to local disk - posted by Vijay Pawnarkar <vi...@gmail.com> on 2015/07/08 21:52:21 UTC, 3 replies.
- Disable heartbeat messages in REPL - posted by Lincoln Atkinson <la...@microsoft.com> on 2015/07/08 23:01:56 UTC, 3 replies.
- Requirement failed: Some of the DStreams have different slide durations - posted by anshu shukla <an...@gmail.com> on 2015/07/08 23:21:54 UTC, 0 replies.
- Real-time data visualization with Zeppelin - posted by "Ganelin, Ilya" <Il...@capitalone.com> on 2015/07/08 23:23:42 UTC, 3 replies.
- Remote spark-submit not working with YARN - posted by jegordon <jg...@gmail.com> on 2015/07/09 00:19:54 UTC, 3 replies.
- Re: FW: MLLIB (Spark) Question. - posted by DB Tsai <db...@dbtsai.com> on 2015/07/09 00:34:33 UTC, 0 replies.
- What does RDD lineage refer to ? - posted by canan chen <cc...@gmail.com> on 2015/07/09 02:21:45 UTC, 1 replies.
- Spark program throws NIO Buffer over flow error (TDigest - Ted Dunning lib) - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/09 03:50:03 UTC, 2 replies.
- Error while taking union - posted by anshu shukla <an...@gmail.com> on 2015/07/09 03:51:43 UTC, 0 replies.
- DLL load failed: %1 is not a valid win32 application on invoking pyspark - posted by Ashish Dutt <as...@gmail.com> on 2015/07/09 04:51:29 UTC, 2 replies.
- Reply: Re: how to use DoubleRDDFunctions on mllib Vector? - posted by prosp4300 <pr...@163.com> on 2015/07/09 06:02:13 UTC, 0 replies.
- Spark query - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/09 06:02:44 UTC, 3 replies.
- SparkR dataFrame read.df fails to read from aws s3 - posted by Ben Spark <be...@yahoo.com.au> on 2015/07/09 06:14:17 UTC, 2 replies.
- Using Hive UDF in spark - posted by vinod kumar <vi...@gmail.com> on 2015/07/09 06:33:51 UTC, 1 replies.
- Re: Writing data to hbase using Sparkstreaming - posted by Ted Yu <yu...@gmail.com> on 2015/07/09 06:33:59 UTC, 0 replies.
- S3 vs HDFS - posted by Brandon White <bw...@gmail.com> on 2015/07/09 08:35:50 UTC, 6 replies.
- spark streaming performance - posted by Michel Hubert <mi...@vsnsystemen.nl> on 2015/07/09 09:21:29 UTC, 8 replies.
- spark ec2 as non-root / any plan to improve that in the future ? - posted by matd <ma...@gmail.com> on 2015/07/09 09:24:52 UTC, 2 replies.
- Questions about Fault tolerance of Spark - posted by 牛兆捷 <nz...@gmail.com> on 2015/07/09 10:19:26 UTC, 0 replies.
- WindowsError: [Error 2] The system cannot find the file specified - posted by ashishdutt <as...@gmail.com> on 2015/07/09 11:07:00 UTC, 0 replies.
- Spark Streaming Hangs on Start - posted by Bin Wang <wb...@gmail.com> on 2015/07/09 11:29:24 UTC, 5 replies.
- Breaking lineage and reducing stages in Spark Streaming - posted by Anand Nalya <an...@gmail.com> on 2015/07/09 11:48:09 UTC, 9 replies.
- Spark Mesos task rescheduling - posted by besil <sb...@beintoo.com> on 2015/07/09 12:32:09 UTC, 2 replies.
- query on Spark + Flume integration using push model - posted by diplomatic Guru <di...@gmail.com> on 2015/07/09 13:05:18 UTC, 2 replies.
- Some BlockManager Doubts - posted by Dibyendu Bhattacharya <di...@gmail.com> on 2015/07/09 13:17:14 UTC, 1 replies.
- Scheduler delay vs. Getting result time - posted by Hans van den Bogert <ha...@gmail.com> on 2015/07/09 13:57:04 UTC, 1 replies.
- databases currently supported by Spark SQL JDBC - posted by Niranda Perera <ni...@gmail.com> on 2015/07/09 14:09:47 UTC, 1 replies.
- Data Processing speed SQL Vs SPARK - posted by vinod kumar <vi...@gmail.com> on 2015/07/09 14:28:04 UTC, 7 replies.
- Re: Spark 1.4.0 - Using SparkR on EC2 Instance - posted by RedOakMark <ma...@redoakstrategic.com> on 2015/07/09 15:06:21 UTC, 0 replies.
- change default storage level - posted by Michal Čizmazia <mi...@gmail.com> on 2015/07/09 16:09:38 UTC, 2 replies.
- Accessing Spark Web UI from another place than where the job actually ran - posted by rroxanaioana <rr...@gmail.com> on 2015/07/09 16:14:25 UTC, 2 replies.
- Number of Threads in Executor to process Tasks - posted by Aniruddh Sharma <as...@gmail.com> on 2015/07/09 16:37:17 UTC, 0 replies.
- [SparkSQL] Incorrect ROLLUP results - posted by Yana Kadiyska <ya...@gmail.com> on 2015/07/09 17:00:04 UTC, 4 replies.
- GraphX Synth Benchmark - posted by AshutoshRaghuvanshi <as...@gmail.com> on 2015/07/09 17:37:29 UTC, 1 replies.
- Spark serialization in closure - posted by Chen Song <ch...@gmail.com> on 2015/07/09 18:04:46 UTC, 6 replies.
- spark streaming kafka compatibility - posted by Shushant Arora <sh...@gmail.com> on 2015/07/09 18:10:35 UTC, 3 replies.
- DataFrame insertInto fails, saveAsTable works (Azure HDInsight) - posted by Daniel Haviv <da...@veracity-group.com> on 2015/07/09 18:12:41 UTC, 1 replies.
- orderBy + cache is invoking work on mesos cluster - posted by Corey Stubbs <cs...@us.ibm.com> on 2015/07/09 18:52:20 UTC, 0 replies.
- How to ignore features in mllib - posted by Arun Luthra <ar...@gmail.com> on 2015/07/09 19:38:30 UTC, 1 replies.
- What is a best practice for passing environment variables to Spark workers? - posted by dgoldenberg <dg...@gmail.com> on 2015/07/09 20:36:15 UTC, 1 replies.
- What is faster for SQL table storage, On-Heap or off-heap? - posted by Brandon White <bw...@gmail.com> on 2015/07/09 21:00:51 UTC, 0 replies.
- Friend recommendation using collaborative filtering? - posted by "Diogo B." <di...@5pm.de> on 2015/07/09 21:17:54 UTC, 0 replies.
- Does spark guarantee that the same task will process the same key over time? - posted by micvog <mi...@micvog.com> on 2015/07/09 21:30:36 UTC, 1 replies.
- SPARK_WORKER_DIR and SPARK_LOCAL_DIR - posted by corrius <co...@gmail.com> on 2015/07/09 21:47:29 UTC, 0 replies.
- Pyspark not working on yarn-cluster mode - posted by jegordon <jg...@gmail.com> on 2015/07/09 23:23:17 UTC, 3 replies.
- [X-post] Saving SparkSQL result RDD to Cassandra - posted by Su She <su...@gmail.com> on 2015/07/09 23:24:00 UTC, 2 replies.
- work around Size exceeds Integer.MAX_VALUE - posted by Michal Čizmazia <mi...@gmail.com> on 2015/07/09 23:50:58 UTC, 4 replies.
- How to specify PATHS for user defined functions. - posted by Dan Dong <do...@gmail.com> on 2015/07/10 00:07:34 UTC, 0 replies.
- Apache Spark : Custom function for reduceByKey - missing arguments for method - posted by ameyamm <am...@outlook.com> on 2015/07/10 03:09:52 UTC, 1 replies.
- [Spark Hive SQL] Set the hive connection in hive context is broken in spark 1.4.1-rc1? - posted by Terry Hole <hu...@gmail.com> on 2015/07/10 05:59:48 UTC, 2 replies.
- SPARK vs SQL - posted by vinod kumar <vi...@gmail.com> on 2015/07/10 06:33:50 UTC, 0 replies.
- Caching in spark - posted by vinod kumar <vi...@gmail.com> on 2015/07/10 06:35:02 UTC, 3 replies.
- Performance slow - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/10 06:40:26 UTC, 0 replies.
- Number of "runJob at SparkPlan.scala:122" in Spark SQL - posted by Wojciech Pituła <w....@gmail.com> on 2015/07/10 07:14:19 UTC, 0 replies.
- Spark Shell "No suitable driver found" error - posted by satish chandra j <js...@gmail.com> on 2015/07/10 08:12:10 UTC, 0 replies.
- HiveContext with Cloudera Pseudo Cluster - posted by Sukhmeet Sethi <su...@absolutdata.com> on 2015/07/10 10:32:25 UTC, 0 replies.
- RE:Building scaladoc using "build/sbt unidoc" failure - posted by MEETHU MATHEW <me...@yahoo.co.in> on 2015/07/10 10:38:35 UTC, 0 replies.
- Ipython notebook, ec2 spark cluster and matplotlib - posted by Marco Didonna <m....@gmail.com> on 2015/07/10 10:48:08 UTC, 0 replies.
- reduceByKeyAndWindow with initial state - posted by Imran Alam <im...@newscred.com> on 2015/07/10 11:07:30 UTC, 2 replies.
- Saving RDD into cassandra keyspace. - posted by "Prateek ." <pr...@aricent.com> on 2015/07/10 11:24:43 UTC, 2 replies.
- K Nearest Neighbours - posted by Carsten Schnober <sc...@ukp.informatik.tu-darmstadt.de> on 2015/07/10 12:02:10 UTC, 1 replies.
- SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use when running spark-shell - posted by "Prateek ." <pr...@aricent.com> on 2015/07/10 12:24:56 UTC, 0 replies.
- Re: SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use when running spark-shell - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/07/10 12:31:59 UTC, 1 replies.
- Starting Spark-Application without explicit submission to cluster? - posted by algermissen1971 <al...@icloud.com> on 2015/07/10 12:37:47 UTC, 2 replies.
- spark-submit - posted by AshutoshRaghuvanshi <as...@gmail.com> on 2015/07/10 12:42:54 UTC, 1 replies.
- Spark performance - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/10 12:49:39 UTC, 10 replies.
- How to restrict disk space for spark caches on yarn? - posted by Peter Rudenko <pe...@gmail.com> on 2015/07/10 12:51:39 UTC, 3 replies.
- Best way to avoid updateStateByKey from running without data - posted by micvog <mi...@micvog.com> on 2015/07/10 13:30:21 UTC, 0 replies.
- Re: Issues when combining Spark and a third party java library - posted by maxdml <ma...@gmail.com> on 2015/07/10 15:21:18 UTC, 3 replies.
- Debug Spark Streaming in PyCharm - posted by blbradley <br...@gmail.com> on 2015/07/10 15:28:21 UTC, 1 replies.
- Spark Broadcasting large dataset - posted by huanglr <hu...@CeBiTec.Uni-Bielefeld.DE> on 2015/07/10 15:52:42 UTC, 2 replies.
- Fwd: SparkSQL Postgres balanced partition of DataFrames - posted by Moises Baly <mo...@urban4m.com> on 2015/07/10 16:04:37 UTC, 0 replies.
- SparkDriverExecutionException when using actorStream - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/07/10 18:27:45 UTC, 1 replies.
- SparkR Error in sparkR.init(master=“local”) in RStudio - posted by kachau <um...@gmail.com> on 2015/07/10 18:30:13 UTC, 3 replies.
- dataFrame.colaesce(1) or dataFrame.reapartition(1) does not seem work for me - posted by kachau <um...@gmail.com> on 2015/07/10 18:48:20 UTC, 0 replies.
- Spark on Tomcat has exception IncompatibleClassChangeError: Implementing class - posted by Zoran Jeremic <zo...@gmail.com> on 2015/07/10 19:09:43 UTC, 7 replies.
- Unit tests of spark application - posted by Naveen Madhire <vm...@umail.iu.edu> on 2015/07/10 19:41:55 UTC, 5 replies.
- SparkHub: a new community site for Apache Spark - posted by Patrick Wendell <pw...@gmail.com> on 2015/07/10 20:33:37 UTC, 0 replies.
- Spark Streaming - Inserting into Tables - posted by Brandon White <bw...@gmail.com> on 2015/07/10 20:55:51 UTC, 3 replies.
- Spark Streaming and using Swift object store for checkpointing - posted by algermissen1971 <al...@icloud.com> on 2015/07/10 23:10:17 UTC, 1 replies.
- SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder" - posted by Mulugeta Mammo <mu...@gmail.com> on 2015/07/11 00:13:14 UTC, 1 replies.
- JAR containing org.apache.hadoop.mapreduce.lib.input.FileInputFormat - posted by Lincoln Atkinson <la...@microsoft.com> on 2015/07/11 00:42:22 UTC, 0 replies.
- Re: JAR containing org.apache.hadoop.mapreduce.lib.input.FileInputFormat - posted by Ted Yu <yu...@gmail.com> on 2015/07/11 03:03:19 UTC, 0 replies.
- Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Roman Sokolov <ol...@gmail.com> on 2015/07/11 03:58:20 UTC, 0 replies.
- Linear search between particular log4j log lines - posted by ssbiox <se...@gmail.com> on 2015/07/11 04:48:59 UTC, 1 replies.
- Ordering of Batches in Spark streaming - posted by anshu shukla <an...@gmail.com> on 2015/07/11 04:50:10 UTC, 4 replies.
- Is it possible to change the default port number 7077 for spark? - posted by ashishdutt <as...@gmail.com> on 2015/07/11 05:50:34 UTC, 6 replies.
- spark streaming doubt - posted by Shushant Arora <sh...@gmail.com> on 2015/07/11 12:00:10 UTC, 3 replies.
- Rdd partitioning - posted by anshu shukla <an...@gmail.com> on 2015/07/11 12:02:41 UTC, 0 replies.
- Worker dies with java.io.IOException: Stream closed - posted by gaurav sharma <sh...@gmail.com> on 2015/07/11 19:48:17 UTC, 2 replies.
- Re: Sum elements of an iterator inside an RDD - posted by "leonida.gianfagna" <le...@gmail.com> on 2015/07/11 20:02:18 UTC, 1 replies.
- Calculating moving average of dataset in Apache Spark and Scala - posted by Anupam Bagchi <an...@rocketmail.com> on 2015/07/12 08:10:42 UTC, 0 replies.
- Moving average using Spark and Scala - posted by Anupam Bagchi <an...@rocketmail.com> on 2015/07/12 08:44:28 UTC, 0 replies.
- Including additional scala libraries in sparkR - posted by Michal Haris <mi...@visualdna.com> on 2015/07/12 12:39:07 UTC, 4 replies.
- Re: createDirectStream and Stats - posted by gaurav sharma <sh...@gmail.com> on 2015/07/12 13:30:26 UTC, 1 replies.
- Few basic spark questions - posted by Oded Maimon <od...@scene53.com> on 2015/07/12 15:49:04 UTC, 6 replies.
- How can the RegressionMetrics produce negative R2 and explained variance? - posted by afarahat <ay...@yahoo.com> on 2015/07/12 17:22:46 UTC, 0 replies.
- Re: How can the RegressionMetrics produce negative R2 and explained variance? - posted by Sean Owen <so...@cloudera.com> on 2015/07/12 17:37:45 UTC, 1 replies.
- Master vs. Slave Nodes Clarification - posted by algermissen1971 <al...@icloud.com> on 2015/07/12 21:34:44 UTC, 5 replies.
- Spark Standalone Mode not working in a cluster - posted by Eduardo <er...@gmail.com> on 2015/07/13 01:04:30 UTC, 2 replies.
- SparkSQL 'describe table' tries to look at all records - posted by Jerrick Hoang <je...@gmail.com> on 2015/07/13 02:03:08 UTC, 5 replies.
- Re: javaRDD.saveasTextfile saves each line enclosed by square brackets - posted by dineh210 <di...@logic4g.com> on 2015/07/13 08:29:10 UTC, 0 replies.
- Spark Parallelism - posted by Jem Tucker <je...@gmail.com> on 2015/07/13 09:26:21 UTC, 0 replies.
- Using sparkR of spark1.4 - posted by 秦召红 <zh...@siat.ac.cn> on 2015/07/13 11:19:57 UTC, 0 replies.
- Spark issue with running CrossValidator with RandomForestClassifier on dataset - posted by shivamverma <sh...@gmail.com> on 2015/07/13 11:45:01 UTC, 1 replies.
- [MLLib][Kmeans] KMeansModel.computeCost takes lot of time - posted by Nirmal Fernando <ni...@wso2.com> on 2015/07/13 11:53:08 UTC, 7 replies.
- Spark Intro - posted by vinod kumar <vi...@gmail.com> on 2015/07/13 12:24:37 UTC, 6 replies.
- Stopping StreamingContext before receiver has started - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/07/13 12:35:01 UTC, 1 replies.
- MovieALS Implicit Error - posted by bliang <bl...@thecarousell.com> on 2015/07/13 12:55:35 UTC, 5 replies.
- Re: Velox Model Server - posted by Nick Pentreath <ni...@gmail.com> on 2015/07/13 13:33:25 UTC, 0 replies.
- Duplicated UnusedStubClass in assembly - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2015/07/13 13:55:17 UTC, 2 replies.
- Share RDD from SparkR and another application - posted by harirajaram <ha...@gmail.com> on 2015/07/13 14:30:46 UTC, 3 replies.
- Do SparkSQL support subquery? - posted by Louis Hust <lo...@gmail.com> on 2015/07/13 15:10:07 UTC, 1 replies.
- [SPARK-SQL] Window Functions optimization - posted by Hao Ren <in...@gmail.com> on 2015/07/13 15:37:38 UTC, 2 replies.
- Re: sparkR - posted by ashishdutt <as...@gmail.com> on 2015/07/13 16:02:40 UTC, 1 replies.
- Adaptive behavior of Spark at different network transfer rates? - posted by Niklas Wilcke <1w...@informatik.uni-hamburg.de> on 2015/07/13 16:21:04 UTC, 0 replies.
- cache() VS cacheTable() - posted by Srikanth <sr...@gmail.com> on 2015/07/13 17:54:47 UTC, 0 replies.
- Does Spark Streaming support streaming from a database table? - posted by unk1102 <um...@gmail.com> on 2015/07/13 18:13:26 UTC, 2 replies.
- Problems after upgrading to spark 1.4.0 - posted by Luis Ángel Vicente Sánchez <la...@gmail.com> on 2015/07/13 18:15:25 UTC, 3 replies.
- java.io.InvalidClassException - posted by Sa...@wellsfargo.com on 2015/07/13 18:32:18 UTC, 4 replies.
- Finding moving average using Spark and Scala - posted by Anupam Bagchi <an...@rocketmail.com> on 2015/07/13 19:07:05 UTC, 8 replies.
- Does Spark driver talk to NameNode directly or Yarn Resource Manager talks to NameNode to know the nodes which has required input blocks and informs Spark Driver ? (for launching Executors on nodes which has required input data blocks) - posted by Elkhan Dadashov <el...@gmail.com> on 2015/07/13 19:10:44 UTC, 1 replies.
- Spark off heap memory leak on Yarn with Kafka direct stream - posted by Apoorva Sareen <ap...@gmail.com> on 2015/07/13 20:05:53 UTC, 2 replies.
- Language support for Spark libraries - posted by Lincoln Atkinson <la...@microsoft.com> on 2015/07/13 20:06:46 UTC, 1 replies.
- fileStream with old files - posted by automaticgiant <hu...@rackspace.com> on 2015/07/13 22:44:27 UTC, 4 replies.
- HDFS performances + unexpected death of executors. - posted by maxdml <ma...@gmail.com> on 2015/07/13 22:48:29 UTC, 1 replies.
- Re: How to make my spark implementation parallel? - posted by maxdml <ma...@gmail.com> on 2015/07/13 23:03:36 UTC, 1 replies.
- MLLIB RDD segmentation for logistic regression - posted by Sa...@wellsfargo.com on 2015/07/13 23:30:01 UTC, 0 replies.
- How to set the heap size on consumers? - posted by dgoldenberg <dg...@gmail.com> on 2015/07/14 01:07:08 UTC, 0 replies.
- spark task hangs at BinaryClassificationMetrics (InetAddress related) - posted by Asher Krim <ak...@hubspot.com> on 2015/07/14 01:38:34 UTC, 0 replies.
- Basic Spark SQL question - posted by Ron Gonzalez <zl...@yahoo.com.INVALID> on 2015/07/14 02:34:01 UTC, 3 replies.
- [SparkScore] Performance portal for Apache Spark - WW28 - posted by "Huang, Jie" <ji...@intel.com> on 2015/07/14 02:45:34 UTC, 0 replies.
- HiveThriftServer2.startWithContext error with registerTempTable - posted by Srikanth <sr...@gmail.com> on 2015/07/14 03:14:49 UTC, 3 replies.
- hive-site.xml spark1.3 - posted by Jerrick Hoang <je...@gmail.com> on 2015/07/14 03:35:31 UTC, 1 replies.
- Upgrade Spark-1.3.0 to Spark-1.4.0 in CDH5.4 - posted by ashishdutt <as...@gmail.com> on 2015/07/14 04:07:52 UTC, 0 replies.
- How to speed up Spark process - posted by ๏̯͡๏ <ÐΞ€ρ@Ҝ>, de...@gmail.com on 2015/07/14 05:34:14 UTC, 10 replies.
- Research ideas using spark - posted by Shashidhar Rao <ra...@gmail.com> on 2015/07/14 07:18:06 UTC, 12 replies.
- RDD checkpoint - posted by 牛兆捷 <nz...@gmail.com> on 2015/07/14 07:35:39 UTC, 0 replies.
- Spark executor memory information - posted by Naveen Dabas <na...@ymail.com> on 2015/07/14 08:19:14 UTC, 1 replies.
- Re: Standalone mode connection failure from worker node to master - posted by sivarani <wh...@gmail.com> on 2015/07/14 09:56:17 UTC, 1 replies.
- spark submit configuration on yarn - posted by Pa Rö <pa...@googlemail.com> on 2015/07/14 10:43:27 UTC, 0 replies.
- Udf's in spark - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/14 11:30:21 UTC, 1 replies.
- java.lang.IllegalStateException: unread block data - posted by Arthur Chan <ar...@gmail.com> on 2015/07/14 12:32:56 UTC, 4 replies.
- About extra memory on yarn mode - posted by Sea <26...@qq.com> on 2015/07/14 14:44:48 UTC, 1 replies.
- No. of Task vs No. of Executors - posted by shahid <sh...@trialx.com> on 2015/07/14 15:43:54 UTC, 4 replies.
- Reply: Does Spark Streaming support streaming from a database table? - posted by focus <fo...@qq.com> on 2015/07/14 16:01:36 UTC, 0 replies.
- correct Scala Imports for creating DFs from RDDs? - posted by ashwang168 <as...@mit.edu> on 2015/07/14 16:57:09 UTC, 1 replies.
- How to maintain multiple JavaRDD created within another method like javaStreamRDD.forEachRDD - posted by unk1102 <um...@gmail.com> on 2015/07/14 17:41:50 UTC, 1 replies.
- ProcessBuilder in SparkLauncher is memory inefficient for launching new process - posted by Elkhan Dadashov <el...@gmail.com> on 2015/07/14 18:39:50 UTC, 1 replies.
- Why does SparkSubmit process takes so much virtual memory in yarn-cluster mode ? - posted by Elkhan Dadashov <el...@gmail.com> on 2015/07/14 18:53:45 UTC, 3 replies.
- spark on yarn - posted by Shushant Arora <sh...@gmail.com> on 2015/07/14 18:57:03 UTC, 12 replies.
- Spark on EMR with S3 example (Python) - posted by "Pagliari, Roberto" <rp...@appcomsci.com> on 2015/07/14 19:50:11 UTC, 4 replies.
- master compile broken for scala 2.11 - posted by Koert Kuipers <ko...@tresata.com> on 2015/07/14 20:22:40 UTC, 1 replies.
- To access elements of a org.apache.spark.mllib.linalg.Vector - posted by Dan Dong <do...@gmail.com> on 2015/07/14 21:23:58 UTC, 2 replies.
- Java 8 vs Scala - posted by spark user <sp...@yahoo.com.INVALID> on 2015/07/14 22:30:21 UTC, 14 replies.
- DataFrame.withColumn() recomputes columns even after cache() - posted by pnpritchard <ni...@falkonry.com> on 2015/07/14 22:30:21 UTC, 1 replies.
- Misaligned Rows with UDF - posted by pedro <sk...@gmail.com> on 2015/07/14 23:34:49 UTC, 0 replies.
- Data Frame for nested json - posted by spark user <sp...@yahoo.com.INVALID> on 2015/07/14 23:52:01 UTC, 0 replies.
- Re: spark streaming with kafka reset offset - posted by Chen Song <ch...@gmail.com> on 2015/07/15 00:00:05 UTC, 6 replies.
- Sessionization using updateStateByKey - posted by swetha <sw...@gmail.com> on 2015/07/15 01:13:29 UTC, 8 replies.
- Unable to use dynamicAllocation if spark.executor.instances is set in spark-defaults.conf - posted by "Kelly, Jonathan" <jo...@amazon.com> on 2015/07/15 01:23:23 UTC, 5 replies.
- Getting not implemented by the TFS FileSystem implementation - posted by Jerrick Hoang <je...@gmail.com> on 2015/07/15 01:28:38 UTC, 2 replies.
- SparkSQL 1.4 can't accept registration of UDF? - posted by ogoh <ok...@gmail.com> on 2015/07/15 02:10:44 UTC, 4 replies.
- Is IndexedRDD available in Spark 1.4.0? - posted by swetha <sw...@gmail.com> on 2015/07/15 02:18:34 UTC, 4 replies.
- Re: creating a distributed index - posted by swetha <sw...@gmail.com> on 2015/07/15 02:23:22 UTC, 4 replies.
- Sorted Multiple Outputs - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/07/15 02:23:36 UTC, 2 replies.
- How do you access a cached Spark SQL Table from a JDBC connection? - posted by Brandon White <bw...@gmail.com> on 2015/07/15 02:25:58 UTC, 3 replies.
- rest on streaming - posted by Chen Song <ch...@gmail.com> on 2015/07/15 03:57:56 UTC, 2 replies.
- MLlib LogisticRegressionWithLBFGS error - posted by Vi Ngo Van <ng...@gmail.com> on 2015/07/15 04:31:10 UTC, 2 replies.
- Using reference for RDD is safe? - posted by Abarah <se...@yahoo.com> on 2015/07/15 05:30:08 UTC, 2 replies.
- Efficiency of leftOuterJoin a cassandra rdd - posted by Wush Wu <wu...@gmail.com> on 2015/07/15 06:15:21 UTC, 6 replies.
- spark cache issue while doing saveAsTextFile and saveAsParquetFile - posted by mathewvinoj <vi...@hotmail.com> on 2015/07/15 07:03:07 UTC, 0 replies.
- Strange behavior of pyspark with --jars option - posted by gen tang <ge...@gmail.com> on 2015/07/15 08:15:14 UTC, 1 replies.
- spark sql - group by constant column - posted by Lior Chaga <li...@taboola.com> on 2015/07/15 09:09:44 UTC, 1 replies.
- Random Forest Error - posted by rishikesh <ri...@hotmail.com> on 2015/07/15 09:30:10 UTC, 2 replies.
- [SparkR] creating dataframe from json file - posted by jianshu <ji...@gmail.com> on 2015/07/15 10:42:27 UTC, 3 replies.
- DataFrame InsertIntoJdbc() Runtime Exception on cluster - posted by Manohar753 <ma...@happiestminds.com> on 2015/07/15 11:13:15 UTC, 1 replies.
- Spark Stream suitability - posted by polariz <st...@gmail.com> on 2015/07/15 11:16:40 UTC, 0 replies.
- Re: Strange behavior of CoalescedRDD - posted by Konstantin Knizhnik <kn...@garret.ru> on 2015/07/15 11:16:56 UTC, 0 replies.
- what is metadata in StructField ? - posted by matd <ma...@gmail.com> on 2015/07/15 11:48:23 UTC, 1 replies.
- DataFrame.write().partitionBy("some_column").parquet(path) produces OutOfMemory with very few items - posted by Nikos Viorres <nv...@gmail.com> on 2015/07/15 13:05:40 UTC, 2 replies.
- Any beginner samples for using ML / MLIB to produce a moving average of a (K, iterable[V]) - posted by Nkechi Achara <nk...@googlemail.com> on 2015/07/15 13:30:30 UTC, 1 replies.
- compression behaviour inconsistency between 1.3 and 1.4 - posted by Marcin Cylke <ma...@ext.allegro.pl> on 2015/07/15 13:33:53 UTC, 0 replies.
- Spark and HDFS - posted by "Jeskanen, Elina" <el...@cgi.com> on 2015/07/15 14:36:59 UTC, 3 replies.
- updateStateByKey schedule time - posted by Michel Hubert <mi...@vsnsystemen.nl> on 2015/07/15 14:42:03 UTC, 1 replies.
- Running mllib from R in Spark 1.4 - posted by madhu phatak <ph...@gmail.com> on 2015/07/15 15:00:18 UTC, 2 replies.
- java heap error - posted by AlexG <sw...@gmail.com> on 2015/07/15 16:31:49 UTC, 0 replies.
- Job aborted due to stage failure: Task not serializable: - posted by Naveen Dabas <na...@ymail.com> on 2015/07/15 16:40:23 UTC, 1 replies.
- DataFrame more efficient than RDD? - posted by k0ala <k0...@gmail.com> on 2015/07/15 16:41:04 UTC, 1 replies.
- Strange Error: "java.lang.OutOfMemoryError: GC overhead limit exceeded" - posted by Saeed Shahrivari <sa...@gmail.com> on 2015/07/15 17:06:23 UTC, 2 replies.
- java.lang.NoClassDefFoundError: Could not initialize class org.fusesource.jansi.internal.Kernel32 - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/07/15 17:08:56 UTC, 1 replies.
- Tasks unevenly distributed in Spark 1.4.0 - posted by gisleyt <gi...@cxense.com> on 2015/07/15 17:48:04 UTC, 0 replies.
- out of memory error in treeAggregate - posted by AlexG <sw...@gmail.com> on 2015/07/15 18:01:45 UTC, 0 replies.
- Spark Accumulator Issue - java.io.IOException: java.lang.StackOverflowError - posted by Jadhav Shweta <ja...@tcs.com> on 2015/07/15 18:04:58 UTC, 1 replies.
- Spark 1.4.0 compute-classpath.sh - posted by lokeshkumar <lo...@dataken.net> on 2015/07/15 18:43:47 UTC, 2 replies.
- Spark job returns a different result on each run - posted by sbvarre <sb...@gmail.com> on 2015/07/15 18:59:54 UTC, 0 replies.
- Info from the event timeline appears to contradict dstat info - posted by Tom Hubregtsen <th...@gmail.com> on 2015/07/15 20:14:46 UTC, 1 replies.
- spark streaming job to hbase write - posted by Shushant Arora <sh...@gmail.com> on 2015/07/15 20:46:16 UTC, 6 replies.
- small accumulator gives out of memory error - posted by AlexG <sw...@gmail.com> on 2015/07/15 21:44:36 UTC, 0 replies.
- NotSerializableException in spark 1.4.0 - posted by Chen Song <ch...@gmail.com> on 2015/07/15 21:46:23 UTC, 6 replies.
- ALS run method versus ALS train versus ALS fit and transform - posted by Carol McDonald <cm...@maprtech.com> on 2015/07/15 22:55:00 UTC, 3 replies.
- Announcing Spark 1.4.1! - posted by Patrick Wendell <pw...@gmail.com> on 2015/07/15 23:48:28 UTC, 0 replies.
- get java.io.FileNotFoundException when use addFile Function - posted by prateek arora <pr...@gmail.com> on 2015/07/16 00:56:15 UTC, 1 replies.
- Python DataFrames, length of array - posted by pedro <sk...@gmail.com> on 2015/07/16 01:05:32 UTC, 0 replies.
- Python DataFrames: length of ArrayType - posted by pedro <sk...@gmail.com> on 2015/07/16 01:31:08 UTC, 1 replies.
- Spark cluster read local files - posted by Julien Beaudan <jb...@stottlerhenke.com> on 2015/07/16 02:20:52 UTC, 1 replies.
- Possible to combine all RDDs from a DStream batch into one? - posted by Jon Chase <jo...@gmail.com> on 2015/07/16 03:58:40 UTC, 3 replies.
- Spark streaming Processing time keeps increasing - posted by N B <nb...@gmail.com> on 2015/07/16 05:13:58 UTC, 7 replies.
- HiBench test for hadoop/hive/spark cluster - posted by lu...@sina.com on 2015/07/16 05:53:34 UTC, 1 replies.
- Reply: Re: HiBench test for hadoop/hive/spark cluster - posted by lu...@sina.com on 2015/07/16 06:38:29 UTC, 0 replies.
- Running foreach on a list of rdds in parallel - posted by Brandon White <bw...@gmail.com> on 2015/07/16 07:37:27 UTC, 2 replies.
- Indexed Store for lookup table - posted by Jem Tucker <je...@gmail.com> on 2015/07/16 10:00:30 UTC, 5 replies.
- Will multiple filters on the same RDD optimized to one filter? - posted by Bin Wang <wb...@gmail.com> on 2015/07/16 10:02:59 UTC, 3 replies.
- spark-streaming with flume run error under yarn model - posted by 鹰 <98...@qq.com> on 2015/07/16 10:32:56 UTC, 0 replies.
- S3 Read / Write makes executors deadlocked - posted by Hao Ren <in...@gmail.com> on 2015/07/16 11:39:49 UTC, 1 replies.
- Spark 1.4.0 org.apache.spark.sql.AnalysisException: cannot resolve 'probability' given input columns - posted by lokeshkumar <lo...@dataken.net> on 2015/07/16 12:41:32 UTC, 0 replies.
- Invalid HDFS path exception - posted by wazza <ra...@gmail.com> on 2015/07/16 12:45:34 UTC, 0 replies.
- DataFrame from RDD[Row] - posted by Marius Danciu <ma...@gmail.com> on 2015/07/16 13:26:45 UTC, 0 replies.
- Use rank with distribute by in HiveContext - posted by Lior Chaga <li...@taboola.com> on 2015/07/16 14:10:58 UTC, 2 replies.
- [SPARK][GRAPHX] 'Executor Deserialize Time' is too big - posted by Hlib Mykhailenko <hl...@inria.fr> on 2015/07/16 15:10:36 UTC, 0 replies.
- PairRDDFunctions and DataFrames - posted by Yana Kadiyska <ya...@gmail.com> on 2015/07/16 15:58:10 UTC, 1 replies.
- Select all columns except some - posted by Sa...@wellsfargo.com on 2015/07/16 16:57:57 UTC, 3 replies.
- Please add our meetup home page in Japan. - posted by Kousuke Saruta <sa...@oss.nttdata.co.jp> on 2015/07/16 17:11:46 UTC, 0 replies.
- pyspark 1.4 udf change date values - posted by Luis Guerra <lu...@gmail.com> on 2015/07/16 17:22:29 UTC, 2 replies.
- BroadCast on Interval ( eg every 10 min ) - posted by Ashish Soni <as...@gmail.com> on 2015/07/16 18:03:50 UTC, 2 replies.
- Resume checkpoint failed with Spark Streaming Kafka via createDirectStream under heavy reprocessing - posted by Nicolas Phung <ni...@gmail.com> on 2015/07/16 18:28:12 UTC, 14 replies.
- create HiveContext if available, otherwise SQLContext - posted by Koert Kuipers <ko...@tresata.com> on 2015/07/16 22:21:42 UTC, 7 replies.
- How to unpersist RDDs generated by ALS/MatrixFactorizationModel - posted by "Stahlman, Jonathan" <Jo...@capitalone.com> on 2015/07/16 23:18:39 UTC, 5 replies.
- Please add two groups to "Community" page - posted by Andrew Vykhodtsev <yo...@gmail.com> on 2015/07/16 23:29:03 UTC, 0 replies.
- Console log file of CoarseGrainedExecutorBackend - posted by Tao Lu <ta...@gmail.com> on 2015/07/17 00:43:55 UTC, 1 replies.
- Setting different amount of cache memory for driver - posted by "Zalzberg, Idan (Agoda)" <Id...@agoda.com> on 2015/07/17 04:35:09 UTC, 0 replies.
- [Spark Shell] Could the spark shell be reset to the original status? - posted by Terry Hole <hu...@gmail.com> on 2015/07/17 05:51:39 UTC, 2 replies.
- Retrieving Spark Configuration properties - posted by RajG <rj...@gmail.com> on 2015/07/17 05:53:11 UTC, 1 replies.
- spark-streaming failed to bind ip address - posted by 鹰 <98...@qq.com> on 2015/07/17 07:49:16 UTC, 0 replies.
- what is : ParquetFileReader: reading summary file ? - posted by sh...@tsmc.com on 2015/07/17 07:56:17 UTC, 1 replies.
- Nullpointer when saving as table with a timestamp column type - posted by Brandon White <bw...@gmail.com> on 2015/07/17 08:18:30 UTC, 0 replies.
- Spark Master HA on YARN - posted by Bhaskar Dutta <bh...@gmail.com> on 2015/07/17 09:29:52 UTC, 0 replies.
- Unread block data error - posted by Jem Tucker <je...@gmail.com> on 2015/07/17 09:39:02 UTC, 0 replies.
- Re: it seem like the exactly once feature not work on spark1.4 - posted by JoneZhang <jo...@gmail.com> on 2015/07/17 10:15:51 UTC, 1 replies.
- use S3-Compatible Storage with spark - posted by Schmirr Wurst <sc...@gmail.com> on 2015/07/17 10:36:58 UTC, 18 replies.
- Spark 1.3.1 + Hive: write output to CSV with header on S3 - posted by Roberto Coluccio <ro...@gmail.com> on 2015/07/17 11:29:08 UTC, 2 replies.
- Reply: Nullpointer when saving as table with a timestamp column type - posted by 鹰 <98...@qq.com> on 2015/07/17 11:31:56 UTC, 0 replies.
- Spark APIs memory usage? - posted by Harit Vishwakarma <ha...@gmail.com> on 2015/07/17 13:03:10 UTC, 7 replies.
- Is it possible to set the number of cores per executor on standalone cluster? - posted by "Zheng, Xudong" <do...@gmail.com> on 2015/07/17 13:53:02 UTC, 0 replies.
- Adding meetup groups to Community page - Moscow, Slovenia, Zagreb - posted by Andrew Vykhodtsev <yo...@gmail.com> on 2015/07/17 17:08:56 UTC, 0 replies.
- spark-shell with Yarn failed - posted by Amjad ALSHABANI <as...@gmail.com> on 2015/07/17 17:37:21 UTC, 4 replies.
- streaming and piping to R, sending all data in window to pipe() - posted by "PAULI, KEVIN CHRISTIAN [AG-Contractor/1000]" <ke...@monsanto.com> on 2015/07/17 18:21:22 UTC, 1 replies.
- exception raised during large spark job against cassandra ring - posted by Bosung Seo <bo...@brightcloud.com> on 2015/07/17 18:24:08 UTC, 0 replies.
- Spark and SQL Server - posted by "Young, Matthew T" <ma...@intel.com> on 2015/07/17 19:15:05 UTC, 4 replies.
- MapType vs StructType - posted by Corey Nolet <cj...@gmail.com> on 2015/07/17 20:02:07 UTC, 3 replies.
- Re: Store DStreams into Hive using Hive Streaming - posted by unk1102 <um...@gmail.com> on 2015/07/17 20:16:56 UTC, 0 replies.
- Has anyone run Python Spark application on Yarn-cluster mode ? (which has 3rd party Python modules (i.e., numpy) to be shipped with) - posted by Elkhan Dadashov <el...@gmail.com> on 2015/07/17 20:23:56 UTC, 0 replies.
- Data frames select and where clause dependency - posted by Mike Trienis <mi...@orcsol.com> on 2015/07/17 20:55:25 UTC, 5 replies.
- Model Save function (ML-Lib) - posted by Guillaume Guy <gu...@gmail.com> on 2015/07/17 21:25:58 UTC, 0 replies.
- What is "java.sql.SQLException: Unsupported type -101"? - posted by "Sambit Tripathy (RBEI/EDS1)" <Sa...@in.bosch.com> on 2015/07/17 23:01:06 UTC, 2 replies.
- Command builder problem when running worker in Windows - posted by Julien Beaudan <jb...@stottlerhenke.com> on 2015/07/17 23:40:55 UTC, 5 replies.
- Cleanup when tasks generate errors - posted by sim <si...@swoop.com> on 2015/07/18 03:14:34 UTC, 0 replies.
- Reading SequenceFiles from S3 with PySpark on EMR causes RACK_LOCAL locality - posted by Charles Menguy <me...@gmail.com> on 2015/07/18 03:16:22 UTC, 1 replies.
- Re: Hash Partitioning and Dataframes - posted by Stephen Boesch <ja...@gmail.com> on 2015/07/18 06:22:29 UTC, 0 replies.
- Re: Flatten list - posted by Gylfi <gy...@berkeley.edu> on 2015/07/18 09:21:47 UTC, 0 replies.
- Re: Spark same execution time on 1 node and 5 nodes - posted by Gylfi <gy...@berkeley.edu> on 2015/07/18 09:56:54 UTC, 0 replies.
- Re: write a HashMap to HDFS in Spark - posted by Gylfi <gy...@berkeley.edu> on 2015/07/18 10:51:49 UTC, 0 replies.
- PicklingError: Could not pickle object as excessively deep recursion required. - posted by Andrej Burja <an...@gmail.com> on 2015/07/18 10:56:42 UTC, 0 replies.
- Re: Passing Broadcast variable as parameter - posted by Gylfi <gy...@berkeley.edu> on 2015/07/18 11:03:44 UTC, 0 replies.
- BigQuery connector for pyspark via Hadoop Input Format example - posted by lfiaschi <lu...@gmail.com> on 2015/07/18 13:19:28 UTC, 0 replies.
- Using Dataframe write with newHdoopApi - posted by ayan guha <gu...@gmail.com> on 2015/07/18 15:19:12 UTC, 5 replies.
- Re: How to extract complex JSON structures using Apache Spark 1.4.0 Data Frames - posted by Naveen Madhire <vm...@umail.iu.edu> on 2015/07/18 18:46:28 UTC, 0 replies.
- Spark-hive parquet schema evolution - posted by Jerrick Hoang <je...@gmail.com> on 2015/07/18 20:11:03 UTC, 6 replies.
- [General Question] [Hadoop + Spark at scale] Spark Rack Awareness ? - posted by Mike Frampton <mi...@hotmail.com> on 2015/07/19 03:25:31 UTC, 1 replies.
- Spark1.4 application throw java.lang.NoClassDefFoundError: javax/servlet/FilterRegistration - posted by Wwh 吴 <ww...@hotmail.com> on 2015/07/19 05:08:25 UTC, 1 replies.
- How to restart Twitter spark stream - posted by Zoran Jeremic <zo...@gmail.com> on 2015/07/19 07:39:57 UTC, 9 replies.
- spark1.4.0, streaming + kafka, throw org.apache.spark.util.TaskCompletionListenerException - posted by Wwh 吴 <ww...@hotmail.com> on 2015/07/19 11:41:27 UTC, 0 replies.
- XML Parsing - posted by Ashish Soni <as...@gmail.com> on 2015/07/19 19:38:18 UTC, 1 replies.
- Counting distinct values for a key? - posted by N B <nb...@gmail.com> on 2015/07/19 20:28:31 UTC, 7 replies.
- Exception while triggering spark job from remote jvm - posted by ankit tyagi <an...@gmail.com> on 2015/07/19 21:03:55 UTC, 2 replies.
- DataFrame Union not passing optimizer assertion - posted by Brandon White <bw...@gmail.com> on 2015/07/19 21:37:31 UTC, 0 replies.
- [SparkScore] Performance portal for Apache Spark - WW29 - posted by "Huang, Jie" <ji...@intel.com> on 2015/07/20 02:51:29 UTC, 0 replies.
- assertion failed error with GraphX - posted by Jack Yang <ji...@uow.edu.au> on 2015/07/20 04:15:44 UTC, 1 replies.
- how to start reading the spark source code? - posted by Yang <te...@gmail.com> on 2015/07/20 04:44:30 UTC, 4 replies.
- Spark Mesos Dispatcher - posted by "Jahagirdar, Madhu" <ma...@philips.com> on 2015/07/20 04:52:11 UTC, 4 replies.
- Toronto Apache Spark - posted by Mehrdad Pazooki <pa...@gmail.com> on 2015/07/20 05:25:56 UTC, 0 replies.
- Error while Partitioning - posted by rishikesh <ri...@hotmail.com> on 2015/07/20 06:53:32 UTC, 0 replies.
- Looking for a few Spark Benchmarks - posted by Steve Lewis <lo...@gmail.com> on 2015/07/20 07:03:57 UTC, 0 replies.
- Re: Kmeans Labeled Point RDD - posted by plazaster <mi...@gmail.com> on 2015/07/20 08:37:41 UTC, 1 replies.
- Hive Query(Top N) - posted by Ravisankar Mani <rr...@gmail.com> on 2015/07/20 08:52:06 UTC, 0 replies.
- 1.4.1 in production - posted by "igor.berman" <ig...@gmail.com> on 2015/07/20 10:03:52 UTC, 0 replies.
- Local Repartition - posted by Daniel Haviv <da...@veracity-group.com> on 2015/07/20 11:04:55 UTC, 4 replies.
- LDA on a large dataset - posted by Peter Zvirinsky <pe...@gmail.com> on 2015/07/20 12:21:46 UTC, 1 replies.
- Proper saving/loading of MatrixFactorizationModel - posted by Petr Shestov <ps...@nvidia.com> on 2015/07/20 12:26:00 UTC, 2 replies.
- PySpark Nested Json Parsing - posted by Ajay <aj...@gmail.com> on 2015/07/20 12:26:20 UTC, 3 replies.
- spark streaming 1.3 issues - posted by Shushant Arora <sh...@gmail.com> on 2015/07/20 13:22:13 UTC, 4 replies.
- Re: JdbcRDD and ClassTag issue - posted by nitinkalra2000 <ni...@gmail.com> on 2015/07/20 14:11:51 UTC, 0 replies.
- Apache Spark : spark.eventLog.dir on Windows Environment - posted by nitinkalra2000 <ni...@gmail.com> on 2015/07/20 14:15:24 UTC, 4 replies.
- Does Spark streaming support is there with RabbitMQ - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/20 14:52:43 UTC, 4 replies.
- Web UI Links - posted by Bob Corsaro <rc...@gmail.com> on 2015/07/20 15:59:21 UTC, 1 replies.
- Joda Time best practice? - posted by algermissen1971 <al...@icloud.com> on 2015/07/20 16:37:10 UTC, 4 replies.
- k-means iteration not terminate - posted by Pa Rö <pa...@googlemail.com> on 2015/07/20 16:51:00 UTC, 2 replies.
- dataframes sql order by not total ordering - posted by Carol McDonald <cm...@maprtech.com> on 2015/07/20 18:25:08 UTC, 2 replies.
- Broadcast variables in R - posted by Serge Franchois <se...@altran.com> on 2015/07/20 19:00:06 UTC, 4 replies.
- Spark 1.4.1,MySQL and DataFrameReader.read.jdbc fun - posted by Aaron <aa...@gmail.com> on 2015/07/20 19:14:13 UTC, 0 replies.
- What is the correct syntax of using Spark streamingContext.fileStream()? - posted by unk1102 <um...@gmail.com> on 2015/07/20 19:40:38 UTC, 1 replies.
- Fwd: Silly question about building Spark 1.4.1 - posted by Michael Segel <ms...@hotmail.com> on 2015/07/20 21:55:39 UTC, 3 replies.
- Increment counter variable in RDD transformation function - posted by dl...@comcast.net on 2015/07/21 00:21:16 UTC, 1 replies.
- Is SPARK is the right choice for traditional OLAP query processing? - posted by "renga.kannan" <re...@gmail.com> on 2015/07/21 02:04:56 UTC, 3 replies.
- spark streaming 1.3 coalesce on kafkadirectstream - posted by Shushant Arora <sh...@gmail.com> on 2015/07/21 05:31:35 UTC, 1 replies.
- standalone to connect mysql - posted by Jack Yang <ji...@uow.edu.au> on 2015/07/21 06:31:08 UTC, 5 replies.
- log4j.xml bundled in jar vs log4.properties in spark/conf - posted by "igor.berman" <ig...@gmail.com> on 2015/07/21 09:57:25 UTC, 1 replies.
- user threads in executors - posted by Shushant Arora <sh...@gmail.com> on 2015/07/21 10:11:46 UTC, 6 replies.
- DataFrame writer removes fields which is null for all rows - posted by Hao Ren <in...@gmail.com> on 2015/07/21 11:09:22 UTC, 0 replies.
- Is there more information about spark shuffle-service - posted by JoneZhang <jo...@gmail.com> on 2015/07/21 11:16:53 UTC, 1 replies.
- Classifier for Big Data Mining - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/07/21 11:22:24 UTC, 3 replies.
- Spark SQL/DDF's for production - posted by bipin <bi...@gmail.com> on 2015/07/21 11:27:44 UTC, 0 replies.
- LinearRegressionWithSGD Outputs NaN - posted by Naveen <na...@formcept.com> on 2015/07/21 11:59:31 UTC, 1 replies.
- Would driver shutdown cause app dead? - posted by ZhuGe <tc...@outlook.com> on 2015/07/21 12:07:29 UTC, 1 replies.
- Running driver app as a daemon - posted by algermissen1971 <al...@icloud.com> on 2015/07/21 12:53:59 UTC, 0 replies.
- writing/reading multiple Parquet files: Failed to merge incompatible data types StringType and StructType - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/07/21 13:11:43 UTC, 2 replies.
- Spark Application stuck retrying task failed on Java heap space? - posted by Romi Kuntsman <ro...@totango.com> on 2015/07/21 15:55:54 UTC, 0 replies.
- 1.4.0 classpath issue with spark-submit - posted by Michal Haris <mi...@visualdna.com> on 2015/07/21 16:04:20 UTC, 2 replies.
- SparkR sqlContext or sc not found in RStudio - posted by unk1102 <um...@gmail.com> on 2015/07/21 16:57:40 UTC, 4 replies.
- Convert Simple Kafka Consumer to standalone Spark JavaStream Consumer - posted by Hafsa Asif <ha...@matchinguu.com> on 2015/07/21 17:23:16 UTC, 1 replies.
- Spark Streaming Checkpointing solutions - posted by Emmanuel <fo...@gmail.com> on 2015/07/21 17:43:30 UTC, 2 replies.
- Accumulator value 0 in driver - posted by dl...@comcast.net on 2015/07/21 17:47:16 UTC, 3 replies.
- pyspark equivalent to Extends Serializable - posted by keegan <ke...@l-3com.com> on 2015/07/21 17:50:46 UTC, 0 replies.
- Timestamp functions for sqlContext - posted by Tal Rozen <ta...@scaleka.com> on 2015/07/21 18:04:10 UTC, 1 replies.
- Re: Spark MLlib instead of Mahout - collaborative filtering model - posted by Anas Sherwani <an...@gmail.com> on 2015/07/21 18:18:35 UTC, 0 replies.
- Add column to DF - posted by Stefan Panayotov <sp...@msn.com> on 2015/07/21 20:49:59 UTC, 2 replies.
- spark streaming disk hit - posted by "Abhishek R. Singh" <ab...@tetrationanalytics.com> on 2015/07/21 20:57:41 UTC, 2 replies.
- spark thrift server supports timeout? - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/07/21 21:31:20 UTC, 1 replies.
- Partition parquet data by ENUM column - posted by ankits <an...@gmail.com> on 2015/07/21 21:41:29 UTC, 6 replies.
- Which memory fraction is Spark using to compute RDDs that are not going to be persisted - posted by wdbaruni <wd...@gmail.com> on 2015/07/21 22:47:02 UTC, 1 replies.
- NullPointerException inside RDD when calling sc.textFile - posted by MorEru <hs...@gmail.com> on 2015/07/21 22:48:27 UTC, 1 replies.
- Spark spark.shuffle.memoryFraction has no effect - posted by wdbaruni <wd...@gmail.com> on 2015/07/21 22:50:20 UTC, 2 replies.
- Class Loading Issue - Spark Assembly and Application Provided - posted by Ashish Soni <as...@gmail.com> on 2015/07/21 22:53:31 UTC, 0 replies.
- Spark SQL Table Caching - posted by Brandon White <bw...@gmail.com> on 2015/07/21 22:59:28 UTC, 1 replies.
- How to share a Map among RDDS? - posted by Dan Dong <do...@gmail.com> on 2015/07/21 23:53:47 UTC, 5 replies.
- Question on Spark SQL for a directory - posted by Ron Gonzalez <zl...@yahoo.com.INVALID> on 2015/07/22 01:06:05 UTC, 1 replies.
- RowId unique key for Dataframes - posted by Srikanth <sr...@gmail.com> on 2015/07/22 01:55:15 UTC, 2 replies.
- Re: Timestamp functions for sqlContext - posted by 鹰 <98...@qq.com> on 2015/07/22 04:07:09 UTC, 0 replies.
- java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path - posted by stark_summer <st...@qq.com> on 2015/07/22 04:51:50 UTC, 0 replies.
- query over hive context hangs, please help - posted by 诺铁 <no...@gmail.com> on 2015/07/22 05:08:57 UTC, 0 replies.
- How to build Spark with my own version of Hadoop? - posted by Dogtail Ray <sp...@gmail.com> on 2015/07/22 05:11:36 UTC, 1 replies.
- many-to-many join - posted by John Berryman <jo...@eventbrite.com> on 2015/07/22 06:44:10 UTC, 2 replies.
- Need help in setting up spark cluster - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/22 07:08:03 UTC, 3 replies.
- Use directories for partition pruning Spark SQL - posted by Johan Lundahl <jo...@gmail.com> on 2015/07/22 10:04:27 UTC, 0 replies.
- Mesos + Spark - posted by boci <bo...@gmail.com> on 2015/07/22 10:53:50 UTC, 5 replies.
- Is spark suitable for real time query - posted by Louis Hust <lo...@gmail.com> on 2015/07/22 12:14:21 UTC, 7 replies.
- Scaling spark cluster for a running application - posted by phagunbaya <ph...@falkonry.com> on 2015/07/22 13:20:21 UTC, 1 replies.
- Applications metrics unseparatable from Master metrics? - posted by Romi Kuntsman <ro...@totango.com> on 2015/07/22 14:38:10 UTC, 0 replies.
- Need help in SparkSQL - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/22 14:47:41 UTC, 5 replies.
- java.lang.IllegalArgumentException: problem reading type: type = group, name = param, original type = null - posted by SkyFox <di...@gmail.com> on 2015/07/22 15:15:24 UTC, 0 replies.
- Re: Parquet problems - posted by Michael Misiewicz <mm...@gmail.com> on 2015/07/22 16:40:34 UTC, 4 replies.
- spark.files.userClassPathFirst=true Return Error - Please help - posted by Ashish Soni <as...@gmail.com> on 2015/07/22 17:03:02 UTC, 0 replies.
- spark.deploy.spreadOut core allocation - posted by Srikanth <sr...@gmail.com> on 2015/07/22 17:05:42 UTC, 3 replies.
- problems running Spark on a firewalled remote YARN cluster via SOCKS proxy - posted by rok <ro...@gmail.com> on 2015/07/22 17:26:32 UTC, 2 replies.
- Help accessing protected S3 - posted by Greg Anderson <gr...@familysearch.org> on 2015/07/22 18:59:47 UTC, 4 replies.
- How to keep RDDs in memory between two different batch jobs? - posted by swetha <sw...@gmail.com> on 2015/07/22 19:56:46 UTC, 4 replies.
- Performance issue with Spark's foreachpartition method - posted by diplomatic Guru <di...@gmail.com> on 2015/07/22 20:11:29 UTC, 4 replies.
- spark.executor.memory and spark.driver.memory have no effect in yarn-cluster mode (1.4.x)? - posted by Michael Misiewicz <mm...@gmail.com> on 2015/07/22 21:38:42 UTC, 2 replies.
- databricks spark sql csv FAILFAST not failing, Spark 1.3.1 Java 7 - posted by Adam Pritchard <ap...@gmail.com> on 2015/07/22 21:49:29 UTC, 0 replies.
- Spark DataFrame created from JavaRDD copies all columns data into first column - posted by unk1102 <um...@gmail.com> on 2015/07/22 22:03:47 UTC, 0 replies.
- Re: spark 1.3.1 : unable to access s3n:// urls (no file system for scheme s3n:) - posted by Eugene Morozov <fa...@list.ru> on 2015/07/22 22:22:22 UTC, 0 replies.
- spark-submit and spark-shell behaviors mismatch. - posted by Dan Dong <do...@gmail.com> on 2015/07/22 23:25:57 UTC, 3 replies.
- Issue with column named "count" in a DataFrame - posted by "Young, Matthew T" <ma...@intel.com> on 2015/07/23 00:04:52 UTC, 3 replies.
- No suitable driver found for jdbc:mysql:// - posted by roni <ro...@gmail.com> on 2015/07/23 00:45:00 UTC, 1 replies.
- Using Wurfl in Spark - posted by Zhongxiao Ma <zm...@4info.com> on 2015/07/23 01:24:04 UTC, 1 replies.
- Package Release Announcement: Spark SQL on HBase "Astro" - posted by "Bing Xiao (Bing)" <bi...@huawei.com> on 2015/07/23 01:53:28 UTC, 3 replies.
- Comparison between Standalone mode and YARN mode - posted by Dogtail Ray <sp...@gmail.com> on 2015/07/23 01:56:49 UTC, 1 replies.
- ShuffledHashJoin instead of CartesianProduct - posted by Srikanth <sr...@gmail.com> on 2015/07/23 04:14:42 UTC, 0 replies.
- What if request cores are not satisfied - posted by "bit1129@163.com" <bi...@163.com> on 2015/07/23 04:23:10 UTC, 1 replies.
- Hive Session gets overwritten in ClientWrapper - posted by Vishak <vi...@gmail.com> on 2015/07/23 07:31:09 UTC, 1 replies.
- How to deal with the spark streaming application while upgrade spark - posted by JoneZhang <jo...@gmail.com> on 2015/07/23 09:51:31 UTC, 1 replies.
- SQL Server to Spark - posted by vinod kumar <vi...@gmail.com> on 2015/07/23 10:42:47 UTC, 1 replies.
- Facing problem in Oracle VM Virtual Box - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/07/23 12:40:03 UTC, 1 replies.
- Asked to remove non-existent executor exception - posted by Pa Rö <pa...@googlemail.com> on 2015/07/23 13:41:56 UTC, 4 replies.
- Create table from local machine - posted by vinod kumar <vi...@gmail.com> on 2015/07/23 14:00:33 UTC, 0 replies.
- [MLLIB] Anyone tried correlation with RDD[Vector] ? - posted by Sa...@wellsfargo.com on 2015/07/23 14:37:55 UTC, 3 replies.
- Schedule lunchtime today for a free webinar "IoT data ingestion in Spark Streaming using Kaa" 11 a.m. PDT (2 p.m. EDT) - posted by Oleh Rozvadovskyy <or...@cybervisiontech.com> on 2015/07/23 16:48:11 UTC, 0 replies.
- Writing binary files in Spark - posted by Oren Shpigel <or...@yowza3d.com> on 2015/07/23 17:14:00 UTC, 5 replies.
- ERROR TaskResultGetter: Exception while getting task result when reading avro files that contain arrays - posted by Arbi Akhina <ar...@gmail.com> on 2015/07/23 18:02:08 UTC, 1 replies.
- Class weights and prediction probabilities in random forest? - posted by Patrick Crenshaw <pa...@crowdstrike.com> on 2015/07/23 18:26:02 UTC, 0 replies.
- Help with Dataframe syntax ( IN / COLLECT_SET) - posted by Yana Kadiyska <ya...@gmail.com> on 2015/07/23 20:35:45 UTC, 0 replies.
- Twitter4J streaming question - posted by pjmccarthy <pm...@eatonvance.com> on 2015/07/23 21:23:41 UTC, 5 replies.
- Fail to load hive tables through Spark - posted by Mithila Joshi <jo...@gmail.com> on 2015/07/23 22:20:28 UTC, 1 replies.
- java.lang.NoSuchMethodError for "list.toMap". - posted by Dan Dong <do...@gmail.com> on 2015/07/23 22:45:27 UTC, 2 replies.
- spark dataframe gc - posted by Mohit Jaggi <mo...@gmail.com> on 2015/07/24 00:03:40 UTC, 1 replies.
- Zeppelin notebook question - posted by Stefan Panayotov <sp...@msn.com> on 2015/07/24 01:37:46 UTC, 0 replies.
- [ Potential bug ] Spark terminal logs say that job has succeeded even though job has failed in Yarn cluster mode - posted by Elkhan Dadashov <el...@gmail.com> on 2015/07/24 01:43:15 UTC, 10 replies.
- Enabling mapreduce.input.fileinputformat.list-status.num-threads in Spark? - posted by Cheolsoo Park <pi...@gmail.com> on 2015/07/24 02:50:49 UTC, 0 replies.
- SparkR Supported Types - Please add "bigint" - posted by Exie <tf...@prodevelop.com.au> on 2015/07/24 04:25:41 UTC, 3 replies.
- Spark - Eclipse IDE - Maven - posted by Siva Reddy <ks...@gmail.com> on 2015/07/24 07:26:01 UTC, 9 replies.
- [POWERED BY] Please add our organization - posted by "Baxter, James" <JA...@woodside.com.au> on 2015/07/24 08:00:06 UTC, 0 replies.
- Encryption on RDDs or in-memory on Apache Spark - posted by IASIB1 <mo...@qub.ac.uk> on 2015/07/24 10:42:16 UTC, 1 replies.
- ERROR SparkUI: Failed to bind SparkUI java.net.BindException: Address already in use: Service 'SparkUI' failed after 16 retries! - posted by Joji John <jj...@ebates.com> on 2015/07/24 12:21:26 UTC, 4 replies.
- How do I query a DSE Cassandra table using Spark Job Server - posted by rsaggere <rs...@hotmail.com> on 2015/07/24 12:28:24 UTC, 0 replies.
- spark-ec2 credentials using aws_security_token - posted by ja...@centrum.cz on 2015/07/24 14:30:08 UTC, 0 replies.
- suggest coding platform - posted by Sa...@wellsfargo.com on 2015/07/24 14:36:21 UTC, 2 replies.
- Performance questions regarding Spark 1.3 standalone mode - posted by Khaled Ammar <kh...@gmail.com> on 2015/07/24 15:35:29 UTC, 1 replies.
- getting Error while Running SparkPi program - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/24 15:43:22 UTC, 0 replies.
- Broadcast HashMap much slower than Array - posted by huanglr <hu...@CeBiTec.Uni-Bielefeld.DE> on 2015/07/24 16:12:50 UTC, 0 replies.
- Spark: configuration file 'metrics.properties' - posted by allonsy <lu...@gmail.com> on 2015/07/24 17:25:09 UTC, 0 replies.
- New consumer - offset one gets in poll is not offset one is supposed to commit - posted by Stevo Slavić <ss...@gmail.com> on 2015/07/24 18:13:27 UTC, 4 replies.
- Programmatically launch several hundred Spark Streams in parallel - posted by Brandon White <bw...@gmail.com> on 2015/07/24 18:23:19 UTC, 2 replies.
- Fwd: want to contribute to apache spark - posted by shashank kapoor <sh...@gmail.com> on 2015/07/24 19:48:06 UTC, 2 replies.
- How to maintain Windows of data along with maintaining session state using UpdateStateByKey - posted by swetha <sw...@gmail.com> on 2015/07/24 20:35:08 UTC, 0 replies.
- 50% performance decrease when using local file vs hdfs - posted by Tom Hubregtsen <th...@gmail.com> on 2015/07/24 20:45:12 UTC, 0 replies.
- spark classpath issue duplicate jar with diff versions - posted by Shushant Arora <sh...@gmail.com> on 2015/07/24 22:02:06 UTC, 1 replies.
- Parquet writing gets progressively slower - posted by Michael Kelly <mi...@gmail.com> on 2015/07/24 22:15:32 UTC, 3 replies.
- Stop condition Spark reading from Kafka with ReliableKafkaReceiver - posted by "castlv@163.com" <ca...@163.com> on 2015/07/25 02:59:53 UTC, 0 replies.
- Small-cluster deployment modes - posted by Edmon Begoli <eb...@gmail.com> on 2015/07/25 04:00:58 UTC, 0 replies.
- ReceiverStream SPARK not able to cope up with 20,000 events /sec . - posted by anshu shukla <an...@gmail.com> on 2015/07/25 11:59:08 UTC, 1 replies.
- Best practice for transforming and storing from Spark to Mongo/HDFS - posted by ni...@free.fr on 2015/07/25 16:14:08 UTC, 1 replies.
- Re: Insert data into a table - posted by sim <si...@swoop.com> on 2015/07/25 17:38:39 UTC, 0 replies.
- [Spark + Hive + EMR + S3] Issue when reading from Hive external table backed on S3 with large amount of small files - posted by Roberto Coluccio <ro...@gmail.com> on 2015/07/25 18:28:40 UTC, 0 replies.
- spark-dataflow + Spark Streaming + Kafka - posted by Albert Strasheim <al...@cloudflare.com> on 2015/07/25 19:01:02 UTC, 0 replies.
- Parallelism of Custom receiver in spark - posted by anshu shukla <an...@gmail.com> on 2015/07/25 19:43:26 UTC, 1 replies.
- Download Apache Spark on Windows 7 for a Proof of Concept installation - posted by Peter Leventis <pl...@telkomsa.net> on 2015/07/25 20:10:23 UTC, 2 replies.
- Question abt serialization - posted by tog <gu...@gmail.com> on 2015/07/25 21:13:08 UTC, 2 replies.
- Multiple operations on same DStream in Spark Streaming - posted by foobar <he...@fb.com> on 2015/07/26 00:07:06 UTC, 2 replies.
- Log For input rate value in streaming statistics - posted by anshu shukla <an...@gmail.com> on 2015/07/26 00:34:57 UTC, 0 replies.
- Twitter streaming with apache spark stream only a small amount of tweets - posted by Zoran Jeremic <zo...@gmail.com> on 2015/07/26 05:44:36 UTC, 6 replies.
- Spark is much slower than direct access MySQL - posted by Louis Hust <lo...@gmail.com> on 2015/07/26 09:47:37 UTC, 7 replies.
- Long running streaming application - worker death - posted by aviemzur <av...@gmail.com> on 2015/07/26 15:29:23 UTC, 2 replies.
- Schema evolution in tables - posted by sim <si...@swoop.com> on 2015/07/26 18:31:36 UTC, 0 replies.
- spark as a lookup engine for dedup - posted by Shushant Arora <sh...@gmail.com> on 2015/07/26 18:37:31 UTC, 3 replies.
- Re: Asked to remove non-existent executor exception - posted by Sea <26...@qq.com> on 2015/07/26 18:57:54 UTC, 0 replies.
- Spark - Cassandra (timestamp question) - posted by Ivan Babic <ba...@gmail.com> on 2015/07/26 19:50:57 UTC, 0 replies.
- Writing streaming data to cassandra creates duplicates - posted by Priya Ch <le...@gmail.com> on 2015/07/26 20:19:54 UTC, 4 replies.
- RDD[Future[T]] => Future[RDD[T]] - posted by Ayoub <be...@gmail.com> on 2015/07/26 22:21:51 UTC, 4 replies.
- Custom partitioner - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/07/26 22:43:57 UTC, 1 replies.
- PYSPARK_DRIVER_PYTHON="ipython" spark/bin/pyspark Does not create SparkContext - posted by Zerony Zhao <bw...@gmail.com> on 2015/07/27 01:06:00 UTC, 2 replies.
- [SparkScore]Performance portal for Apache Spark - WW30 - posted by "Huang, Jie" <ji...@intel.com> on 2015/07/27 05:10:27 UTC, 0 replies.
- unserialize error in sparkR - posted by Jennifer15 <bs...@purdue.edu> on 2015/07/27 07:47:24 UTC, 1 replies.
- Functions in Spark SQL - posted by vinod kumar <vi...@gmail.com> on 2015/07/27 09:04:14 UTC, 2 replies.
- hive.contrib.serde2.RegexSerDe not found - posted by ZhuGe <tc...@outlook.com> on 2015/07/27 09:35:26 UTC, 1 replies.
- spark spark-ec2 credentials using aws_security_token - posted by ja...@centrum.cz on 2015/07/27 09:43:13 UTC, 2 replies.
- Why the length of each task varies - posted by Gavin Liu <il...@gmail.com> on 2015/07/27 11:29:01 UTC, 1 replies.
- Spark - Serialization with Kryo - posted by Pa Rö <pa...@googlemail.com> on 2015/07/27 13:18:56 UTC, 0 replies.
- Data from PostgreSQL to Spark - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/27 15:00:47 UTC, 10 replies.
- java.lang.ArrayIndexOutOfBoundsException: 0 on Yarn Client - posted by Manohar753 <ma...@happiestminds.com> on 2015/07/27 16:54:17 UTC, 6 replies.
- SparkR - posted by Mohit Anchlia <mo...@gmail.com> on 2015/07/27 19:07:30 UTC, 1 replies.
- Getting java.net.BindException when attempting to start Spark master on EC2 node with public IP - posted by Wayne Song <wa...@gmail.com> on 2015/07/27 19:07:50 UTC, 4 replies.
- Unexpected performance issues with Spark SQL using Parquet - posted by Jerry Lam <ch...@gmail.com> on 2015/07/27 20:37:17 UTC, 1 replies.
- Spark build/sbt assembly - posted by Rahul Palamuttam <ra...@gmail.com> on 2015/07/27 20:38:29 UTC, 5 replies.
- CPU Parallelization not being used (local mode) - posted by Sa...@wellsfargo.com on 2015/07/27 21:57:27 UTC, 1 replies.
- pyspark issue - posted by Naveen Madhire <vm...@umail.iu.edu> on 2015/07/27 23:40:46 UTC, 1 replies.
- Spree: a live-updating web UI for Spark - posted by Ryan Williams <ry...@gmail.com> on 2015/07/28 00:14:31 UTC, 0 replies.
- Do I really need to build Spark for Hive/Thrift Server support? - posted by ReeceRobinson <Re...@TheRobinsons.gen.nz> on 2015/07/28 00:56:28 UTC, 1 replies.
- Weird error using absolute path to run pyspark when using ipython driver - posted by Zerony Zhao <bw...@gmail.com> on 2015/07/28 01:49:35 UTC, 0 replies.
- Controlling output fileSize in SparkSQL - posted by Tim Smith <se...@gmail.com> on 2015/07/28 02:42:00 UTC, 0 replies.
- Spark SQL Error - posted by An Tran <tr...@gmail.com> on 2015/07/28 02:44:44 UTC, 1 replies.
- Json parsing library for Spark Streaming? - posted by swetha <sw...@gmail.com> on 2015/07/28 03:07:08 UTC, 1 replies.
- streaming issue - posted by "guoqing0629@yahoo.com.hk" <gu...@yahoo.com.hk> on 2015/07/28 04:48:50 UTC, 1 replies.
- Spark on Mesos - Shut down failed while running spark-shell - posted by Haripriya Ayyalasomayajula <ah...@gmail.com> on 2015/07/28 05:32:54 UTC, 1 replies.
- GenericRowWithSchema is too heavy - posted by Kevin Jung <it...@samsung.com> on 2015/07/28 06:02:01 UTC, 1 replies.
- NO Cygwin Support in bin/spark-class in Spark 1.4.0 - posted by Proust GZ Feng <pf...@cn.ibm.com> on 2015/07/28 06:19:30 UTC, 10 replies.
- Create StructType column in data frame - posted by Raghavendra Pandey <ra...@gmail.com> on 2015/07/28 06:34:47 UTC, 0 replies.
- Which directory contains third party libraries for Spark - posted by Stephen Boesch <ja...@gmail.com> on 2015/07/28 07:22:14 UTC, 3 replies.
- log file directory - posted by Jack Yang <ji...@uow.edu.au> on 2015/07/28 08:28:43 UTC, 1 replies.
- Re: Failed stages and dropped executors when running implicit matrix factorization/ALS : Too many values to unpack - posted by Xiangrui Meng <me...@gmail.com> on 2015/07/28 08:38:49 UTC, 0 replies.
- Heatmap with Spark Streaming - posted by UMESH CHAUDHARY <um...@gmail.com> on 2015/07/28 08:48:13 UTC, 6 replies.
- Spark Number of Partitions Recommendations - posted by Rahul Palamuttam <ra...@gmail.com> on 2015/07/28 09:42:29 UTC, 2 replies.
- Spark-Cassandra connector DataFrame - posted by simon wang <xw...@yahoo.com.INVALID> on 2015/07/28 09:47:45 UTC, 0 replies.
- A question about spark checkpoint - posted by "bit1129@163.com" <bi...@163.com> on 2015/07/28 09:54:55 UTC, 0 replies.
- pyspark/py4j tree error - posted by Dirk Nachbar <di...@gmail.com> on 2015/07/28 10:04:04 UTC, 0 replies.
- Checkpoints in SparkStreaming - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/07/28 10:14:15 UTC, 1 replies.
- java.io.IOException: failure to login - posted by glen <gl...@openet.com> on 2015/07/28 10:18:43 UTC, 0 replies.
- Spark SQL ArrayOutofBoundsException Question - posted by tranan <tr...@gmail.com> on 2015/07/28 11:56:38 UTC, 0 replies.
- Messages are not stored for actorStream when using RoundRobinRouter - posted by Juan Rodríguez Hortalá <ju...@gmail.com> on 2015/07/28 13:12:38 UTC, 0 replies.
- Iterating over values by Key - posted by gulyasm <mg...@gmail.com> on 2015/07/28 13:15:01 UTC, 0 replies.
- Cluster setup for SPARK standalone application: - posted by Sagar <sa...@yonsei.ac.kr> on 2015/07/28 14:01:54 UTC, 1 replies.
- Re: *Metrics API is odd in MLLib - posted by Sam <sa...@gmail.com> on 2015/07/28 14:44:20 UTC, 0 replies.
- spark streaming get kafka individual message's offset and partition no - posted by Shushant Arora <sh...@gmail.com> on 2015/07/28 14:48:17 UTC, 2 replies.
- Checkpoint issue in spark streaming - posted by Sadaf <sa...@platalytics.com> on 2015/07/28 15:59:26 UTC, 0 replies.
- PySpark MLlib Numpy Dependency - posted by "Eskilson,Aleksander" <Al...@Cerner.com> on 2015/07/28 16:34:56 UTC, 0 replies.
- projection optimization? - posted by Eric Friedman <er...@gmail.com> on 2015/07/28 16:47:03 UTC, 0 replies.
- sc.parallelise to work more like a producer/consumer? - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/28 16:58:22 UTC, 1 replies.
- spark-csv number of partitions - posted by Srikanth <sr...@gmail.com> on 2015/07/28 17:40:32 UTC, 0 replies.
- Generalised Spark-HBase integration - posted by Michal Haris <mi...@visualdna.com> on 2015/07/28 17:59:45 UTC, 4 replies.
- Actor not found for: ActorSelection - posted by Haseeb <11...@seecs.edu.pk> on 2015/07/28 19:56:30 UTC, 1 replies.
- Fighting against performance: JDBC RDD badly distributed - posted by Sa...@wellsfargo.com on 2015/07/28 20:41:16 UTC, 3 replies.
- DataFrame DAG recomputed even though DataFrame is cached? - posted by Kristina Rogale Plazonic <kp...@gmail.com> on 2015/07/28 20:50:25 UTC, 1 replies.
- restart from last successful stage - posted by Alex Nastetsky <al...@vervemobile.com> on 2015/07/28 22:03:32 UTC, 5 replies.
- [Spark ML] HasInputCol, etc. - posted by Matt Narrell <ma...@gmail.com> on 2015/07/28 22:32:27 UTC, 1 replies.
- Has anybody ever tried running Spark Streaming on 500 text streams? - posted by Brandon White <bw...@gmail.com> on 2015/07/29 00:06:07 UTC, 7 replies.
- broadcast variable question - posted by Jonathan Coveney <jc...@gmail.com> on 2015/07/29 01:23:45 UTC, 2 replies.
- Re: Spark Streaming Json file groupby function - posted by swetha <sw...@gmail.com> on 2015/07/29 01:37:15 UTC, 1 replies.
- unsubscribe - posted by Harshvardhan Chauhan <ha...@gumgum.com> on 2015/07/29 02:03:41 UTC, 4 replies.
- Re: Getting the number of slaves - posted by amkcom <am...@gmail.com> on 2015/07/29 02:18:17 UTC, 1 replies.
- Spark and Speech Recognition - posted by Peter Wolf <op...@gmail.com> on 2015/07/29 03:20:38 UTC, 3 replies.
- Job hang when running random forest - posted by Andy Zhao <an...@gmail.com> on 2015/07/29 04:25:46 UTC, 2 replies.
- Authentication Support with spark-submit cluster mode - posted by Anh Hong <ho...@yahoo.com.INVALID> on 2015/07/29 05:51:20 UTC, 2 replies.
- SparkR does not include SparkContext - posted by Siegfried Bilstein <sb...@gmail.com> on 2015/07/29 05:57:09 UTC, 0 replies.
- Does spark-submit support file transfering from local to cluster? - posted by Anh Hong <ho...@yahoo.com.INVALID> on 2015/07/29 06:02:54 UTC, 2 replies.
- Re: RDDs join problem: incorrect result - posted by ponkin <al...@ya.ru> on 2015/07/29 06:54:14 UTC, 1 replies.
- Spark Interview Questions - posted by "Mishra, Abhishek" <Ab...@xerox.com> on 2015/07/29 08:02:12 UTC, 4 replies.
- Re: error in twitter streaming - posted by Sadaf <sa...@platalytics.com> on 2015/07/29 10:17:39 UTC, 0 replies.
- Spark Streaming - posted by Sadaf <sa...@platalytics.com> on 2015/07/29 10:54:58 UTC, 1 replies.
- PermGen Space Error - posted by Sarath Chandra <sa...@algofusiontech.com> on 2015/07/29 11:09:18 UTC, 4 replies.
- Lambda serialization - posted by Subshiri S <su...@gmail.com> on 2015/07/29 12:05:49 UTC, 0 replies.
- Spark WebUI link problem in Mesos Master - posted by Anton Kirillov <an...@gmail.com> on 2015/07/29 13:27:42 UTC, 2 replies.
- Exception while submit spark job through yarn client - posted by ankit tyagi <an...@gmail.com> on 2015/07/29 14:20:45 UTC, 0 replies.
- FW: Executing spark code in Zeppelin - posted by Stefan Panayotov <sp...@msn.com> on 2015/07/29 14:21:29 UTC, 0 replies.
- Re: Executing spark code in Zeppelin - posted by Silvio Fiorito <si...@granturing.com> on 2015/07/29 14:50:51 UTC, 1 replies.
- Simple Map Reduce taking lot of time - posted by Varadharajan Mukundan <sr...@gmail.com> on 2015/07/29 15:08:15 UTC, 0 replies.
- How to read a Json file with a specific format? - posted by SparknewUser <me...@gmail.com> on 2015/07/29 15:37:00 UTC, 4 replies.
- Count of distinct values in each column - posted by "Devi P.V" <de...@gmail.com> on 2015/07/29 15:38:15 UTC, 1 replies.
- Graceful shutdown for Spark Streaming - posted by Michal Čizmazia <mi...@gmail.com> on 2015/07/29 15:43:02 UTC, 4 replies.
- Anyone using Intel's spark-streamingsql project to execute SQL queries over Spark streaming ? - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/07/29 16:27:21 UTC, 0 replies.
- Too many open files - posted by Sa...@wellsfargo.com on 2015/07/29 17:39:37 UTC, 4 replies.
- sc.parallelize(512k items) doesn't always use 64 executors - posted by Kostas Kougios <ko...@googlemail.com> on 2015/07/29 17:57:45 UTC, 2 replies.
- Error in starting Spark Streaming Context - posted by Sadaf <sa...@platalytics.com> on 2015/07/29 17:59:38 UTC, 0 replies.
- streamingContext.stop(true,true) doesn't end the job - posted by mike <mi...@gmail.com> on 2015/07/29 18:31:06 UTC, 0 replies.
- stopped SparkContext remaining active - posted by Andres Perez <an...@tresata.com> on 2015/07/29 18:38:07 UTC, 3 replies.
- HiveQL to SparkSQL - posted by Bigdata techguy <bi...@gmail.com> on 2015/07/29 18:49:15 UTC, 2 replies.
- Spark Streaming Kafka could not find leader offset for Set() - posted by unk1102 <um...@gmail.com> on 2015/07/29 19:38:55 UTC, 6 replies.
- IP2Location within spark jobs - posted by Filli Alem <Al...@ti8m.ch> on 2015/07/29 21:04:32 UTC, 1 replies.
- Difference between RandomForestModel and RandomForestClassificationModel - posted by praveen S <my...@gmail.com> on 2015/07/29 21:14:28 UTC, 1 replies.
- broadcast variable and accumulators issue while spark streaming checkpoint recovery - posted by Shushant Arora <sh...@gmail.com> on 2015/07/29 22:15:38 UTC, 3 replies.
- Passing SPARK_CONF_DIR to slaves in standalone mode under Grid Engine job - posted by David Chin <da...@drexel.edu> on 2015/07/29 23:39:53 UTC, 0 replies.
- NoClassDefFoundError: scala/collection/GenTraversableOnce$class - posted by Benjamin Ross <br...@Lattice-Engines.com> on 2015/07/30 02:14:27 UTC, 2 replies.
- How to set log level in spark-submit ? - posted by canan chen <cc...@gmail.com> on 2015/07/30 04:01:35 UTC, 4 replies.
- help plz! how to use zipWithIndex to each subset of a RDD - posted by askformore <as...@163.com> on 2015/07/30 04:13:09 UTC, 4 replies.
- Is it Spark Serialization bug ? - posted by Subshiri S <su...@gmail.com> on 2015/07/30 08:00:06 UTC, 0 replies.
- TFIDF Transformation - posted by zi...@accenture.com on 2015/07/30 08:38:07 UTC, 1 replies.
- Spark on YARN - posted by Jeetendra Gangele <ga...@gmail.com> on 2015/07/30 08:51:01 UTC, 4 replies.
- Re: Spark Streaming Cannot Work On Next Interval - posted by Himanshu Mehra <hi...@gmail.com> on 2015/07/30 09:42:03 UTC, 0 replies.
- Running Spark on user-provided Hadoop installation - posted by hermansc <he...@gmail.com> on 2015/07/30 10:48:19 UTC, 1 replies.
- Upgrade of Spark-Streaming application - posted by Nicola Ferraro <ni...@gmail.com> on 2015/07/30 11:07:25 UTC, 1 replies.
- How to perform basic statistics on a Json file to explore my numeric and non-numeric variables? - posted by SparknewUser <me...@gmail.com> on 2015/07/30 11:33:18 UTC, 0 replies.
- Twitter Connector-Spark Streaming - posted by Sadaf <sa...@platalytics.com> on 2015/07/30 12:49:27 UTC, 2 replies.
- Error SparkStreaming after a while executing. - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/07/30 13:04:18 UTC, 2 replies.
- Problems with JobScheduler - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/07/30 14:41:35 UTC, 10 replies.
- Python version collision - posted by Javier Domingo Cansino <ja...@gmail.com> on 2015/07/30 15:11:24 UTC, 0 replies.
- Re: Connection closed/reset by peers error - posted by firemonk9 <dh...@gmail.com> on 2015/07/30 15:53:21 UTC, 0 replies.
- Re: Lost task - connection closed - posted by firemonk9 <dh...@gmail.com> on 2015/07/30 16:14:35 UTC, 0 replies.
- Spark Master Build Git Commit Hash - posted by Jerry Lam <ch...@gmail.com> on 2015/07/30 16:39:14 UTC, 4 replies.
- How to register array class with Kyro in spark-defaults.conf - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/07/30 17:06:13 UTC, 3 replies.
- How to control Spark Executors from getting Lost when using YARN client mode? - posted by unk1102 <um...@gmail.com> on 2015/07/30 18:08:13 UTC, 3 replies.
- apache-spark 1.3.0 and yarn integration and spring-boot as a container - posted by Nirav Patel <np...@xactlycorp.com> on 2015/07/30 19:47:36 UTC, 1 replies.
- Spark SQL DataFrame: Nullable column and filtering - posted by martinibus77 <ma...@googlemail.com> on 2015/07/30 20:19:33 UTC, 5 replies.
- [Parquet + Dataframes] Column names with spaces - posted by angelini <al...@shopify.com> on 2015/07/30 20:49:08 UTC, 1 replies.
- Cast Error DataFrame/RDD doing group by and case class - posted by Rishabh Bhardwaj <rb...@gmail.com> on 2015/07/30 20:57:40 UTC, 0 replies.
- Failed to load class for data source: org.apache.spark.sql.cassandra - posted by Benjamin Ross <br...@Lattice-Engines.com> on 2015/07/30 21:45:13 UTC, 2 replies.
- Parquet SaveMode.Append Trouble. - posted by satyajit vegesna <sa...@gmail.com> on 2015/07/31 00:26:51 UTC, 0 replies.
- Does Spark Streaming need to list all the files in a directory? - posted by Brandon White <bw...@gmail.com> on 2015/07/31 00:55:10 UTC, 1 replies.
- How do i specify the data types in a DF - posted by afarahat <ay...@yahoo.com> on 2015/07/31 02:31:07 UTC, 0 replies.
- Problem submiting an script .py against an standalone cluster. - posted by fordfarline <fo...@gmail.com> on 2015/07/31 04:19:34 UTC, 2 replies.
- How RDD lineage works - posted by "bit1129@163.com" <bi...@163.com> on 2015/07/31 04:39:48 UTC, 6 replies.
- Losing files in hdfs after creating spark sql table - posted by Ron Gonzalez <zl...@yahoo.com.INVALID> on 2015/07/31 05:57:49 UTC, 0 replies.
- Spark-Submit error - posted by satish chandra j <js...@gmail.com> on 2015/07/31 08:06:41 UTC, 0 replies.
- Encryption on RDDs or in-memory/cache on Apache Spark - posted by Matthew O'Reilly <mo...@qub.ac.uk> on 2015/07/31 10:17:27 UTC, 0 replies.
- SparkLauncher not notified about finished job - hangs infinitely. - posted by Tomasz Guziałek <To...@HumanInference.com> on 2015/07/31 11:45:19 UTC, 4 replies.
- looking for helps in using graphx aggregateMessages - posted by man june <zh...@yahoo.com.INVALID> on 2015/07/31 12:27:09 UTC, 0 replies.
- Buffer Overflow exception - posted by vinod kumar <vi...@gmail.com> on 2015/07/31 13:02:32 UTC, 0 replies.
- Re: Big Integer number in Spark - posted by ssingal05 <ss...@gmail.com> on 2015/07/31 16:18:42 UTC, 0 replies.
- Re: setting fs.umask in pyspark - posted by aesilberstein <ad...@trifacta.com> on 2015/07/31 17:34:03 UTC, 0 replies.
- Issues with JavaRDD.subtract(JavaRDD) method in local vs. cluster mode - posted by Warfish <se...@gmail.com> on 2015/07/31 18:01:22 UTC, 2 replies.
- [POWERED BY] Please add Typesafe to the list of organizations - posted by Dean Wampler <de...@gmail.com> on 2015/07/31 18:29:34 UTC, 0 replies.
- Setting a stage timeout - posted by William Kinney <wi...@gmail.com> on 2015/07/31 19:01:42 UTC, 1 replies.
- Checkpointing doesn't appear to be working for direct streaming from Kafka - posted by Dmitry Goldenberg <dg...@gmail.com> on 2015/07/31 19:16:03 UTC, 4 replies.
- Record Linkage in Spark - posted by dihash <sh...@icloud.com> on 2015/07/31 20:05:25 UTC, 0 replies.
- How to create Spark DataFrame using custom Hadoop InputFormat? - posted by unk1102 <um...@gmail.com> on 2015/07/31 20:24:51 UTC, 5 replies.
- How to add multiple sequence files from HDFS to a Spark Context to do Batch processing? - posted by swetha <sw...@gmail.com> on 2015/07/31 20:35:29 UTC, 1 replies.
- How to increase parallelism of a Spark cluster? - posted by Sujit Pal <su...@gmail.com> on 2015/07/31 22:03:12 UTC, 0 replies.
- What happens when you create more DStreams then nodes in the cluster? - posted by Brandon White <bw...@gmail.com> on 2015/07/31 22:52:44 UTC, 1 replies.
- how to convert a sequence of TimeStamp to a dataframe - posted by Joanne Contact <jo...@gmail.com> on 2015/07/31 23:50:24 UTC, 0 replies.