- Re: Kafka Direct does not recover automatically when the Kafka Stream gets messed up? - posted by swetha kasireddy <sw...@gmail.com> on 2015/12/01 00:13:05 UTC, 0 replies.
- Re: Spark 1.5.2 + Hive 1.0.0 in Amazon EMR 4.2.0 - posted by Ted Yu <yu...@gmail.com> on 2015/12/01 00:51:47 UTC, 0 replies.
- Recovery for Spark Streaming Kafka Direct in case of issues with Kafka - posted by SRK <sw...@gmail.com> on 2015/12/01 01:22:56 UTC, 13 replies.
- Re: Grid search with Random Forest - posted by Joseph Bradley <jo...@databricks.com> on 2015/12/01 01:34:02 UTC, 6 replies.
- capture video with spark streaming - posted by Lavallen Pablo <in...@yahoo.com.ar> on 2015/12/01 02:06:34 UTC, 1 replies.
- Re: dfs.blocksize is not applicable to some cases - posted by Jung <jb...@naver.com> on 2015/12/01 02:22:25 UTC, 1 replies.
- Spark Streaming Specify Kafka Partition - posted by Alan Braithwaite <al...@cloudflare.com> on 2015/12/01 02:43:22 UTC, 5 replies.
- spark rdd grouping - posted by Rajat Kumar <ra...@gmail.com> on 2015/12/01 02:46:10 UTC, 4 replies.
- ORC predicate pushdown in HQL - posted by Tejas Patil <te...@gmail.com> on 2015/12/01 02:56:26 UTC, 1 replies.
- SparkPi running slower with more cores on each worker - posted by yiskylee <yi...@gmail.com> on 2015/12/01 05:01:45 UTC, 0 replies.
- Spark streaming job hangs - posted by Cassa L <lc...@gmail.com> on 2015/12/01 07:13:37 UTC, 2 replies.
- Re: question about combining small parquet files - posted by Sabarish Sasidharan <sa...@manthan.com> on 2015/12/01 07:32:04 UTC, 0 replies.
- Re: Cant start master on windows 7 - posted by Jacek Laskowski <ja...@japila.pl> on 2015/12/01 09:33:30 UTC, 0 replies.
- Re: Help with type check - posted by Eyal Sharon <ey...@scene53.com> on 2015/12/01 09:34:31 UTC, 0 replies.
- merge 3 different types of RDDs in one - posted by Shams ul Haque <sh...@cashcare.in> on 2015/12/01 10:47:20 UTC, 6 replies.
- New to Spark - posted by Ashok Kumar <as...@yahoo.com.INVALID> on 2015/12/01 10:54:27 UTC, 3 replies.
- Re: Failing to execute Pregel shortest path on 22k nodes - posted by Robineast <Ro...@xense.co.uk> on 2015/12/01 11:30:06 UTC, 0 replies.
- Unable to get phoenix connection in spark job in secured cluster - posted by Akhilesh Pathodia <pa...@gmail.com> on 2015/12/01 13:15:44 UTC, 4 replies.
- Turning off DTD Validation using XML Utils package - Spark - posted by Shivalik <sh...@outlook.com> on 2015/12/01 14:15:24 UTC, 2 replies.
- Scala 2.11 and Akka 2.4.0 - posted by RodrigoB <ro...@aspect.com> on 2015/12/01 14:32:25 UTC, 13 replies.
- diff between apps and waitingApps? - posted by Romi Kuntsman <ro...@totango.com> on 2015/12/01 14:49:59 UTC, 0 replies.
- Send JsonDocument to Couchbase - posted by Eyal Sharon <ey...@scene53.com> on 2015/12/01 15:12:06 UTC, 1 replies.
- Effective ways monitor and identify that a Streaming job has been failing for the last 5 minutes - posted by SRK <sw...@gmail.com> on 2015/12/01 16:45:00 UTC, 2 replies.
- Re: Spark and simulated annealing - posted by marfago <ma...@inwind.it> on 2015/12/01 16:52:52 UTC, 0 replies.
- Re: spark-ec2 vs. EMR - posted by Nick Chammas <ni...@gmail.com> on 2015/12/01 17:15:50 UTC, 7 replies.
- Migrate a cassandra table among from one cluster to another - posted by George Sigletos <si...@textkernel.nl> on 2015/12/01 17:20:59 UTC, 0 replies.
- spark streaming count msg in batch - posted by patcharee <Pa...@uni.no> on 2015/12/01 18:32:57 UTC, 1 replies.
- Getting all files of a table - posted by Krzysztof Zarzycki <k....@gmail.com> on 2015/12/01 19:55:48 UTC, 2 replies.
- Low Latency SQL query - posted by Andrés Ivaldi <ia...@gmail.com> on 2015/12/01 20:51:37 UTC, 14 replies.
- ClassLoader resources on executor - posted by Charles Allen <ch...@metamarkets.com> on 2015/12/01 21:45:28 UTC, 4 replies.
- Graph testing question - posted by Nathan Kronenfeld <nk...@uncharted.software> on 2015/12/01 22:27:08 UTC, 0 replies.
- Driver Hangs before starting Job - posted by Patrick Brown <pa...@gmail.com> on 2015/12/01 22:55:31 UTC, 0 replies.
- Spark DIMSUM Memory requirement? - posted by Parin Choganwala <pa...@7parkdata.com> on 2015/12/01 22:56:15 UTC, 0 replies.
- Question about yarn-cluster mode and spark.driver.allowMultipleContexts - posted by Anfernee Xu <an...@gmail.com> on 2015/12/02 00:32:11 UTC, 7 replies.
- Master is listing DEAD slaves, can they be cleaned up? - posted by Dillian Murphey <cr...@gmail.com> on 2015/12/02 01:12:55 UTC, 0 replies.
- how to use spark.mesos.constraints - posted by rarediel <br...@gettyimages.com> on 2015/12/02 02:14:42 UTC, 0 replies.
- SparkSQL API to insert DataFrame into a static partition? - posted by Isabelle Phan <nl...@gmail.com> on 2015/12/02 03:50:25 UTC, 6 replies.
- Can Spark Execute Hive Update/Delete operations - posted by 张炜 <zh...@gmail.com> on 2015/12/02 04:08:06 UTC, 4 replies.
- Re: Spark Expand Cluster - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/12/02 04:59:11 UTC, 0 replies.
- Spark Streaming - History UI - posted by patcharee <Pa...@uni.no> on 2015/12/02 06:28:42 UTC, 2 replies.
- Increasing memory usage on batch job (pyspark) - posted by Aaron Jackson <aj...@pobox.com> on 2015/12/02 06:46:18 UTC, 0 replies.
- Retrieving the PCA parameters in pyspark - posted by Rohit Girdhar <ro...@gmail.com> on 2015/12/02 06:49:12 UTC, 2 replies.
- Re: General question on using StringIndexer in SparkML - posted by Vishnu Viswanath <vi...@gmail.com> on 2015/12/02 07:31:36 UTC, 4 replies.
- Graphx: How to print the group of connected components one by one - posted by "Zhang, Jingyu" <ji...@news.com.au> on 2015/12/02 07:53:00 UTC, 0 replies.
- Re: Spark on Mesos with Centos 6.6 NFS - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/02 08:18:25 UTC, 0 replies.
- Re: what is algorithm to optimize function with nonlinear constraints - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/02 08:54:42 UTC, 0 replies.
- sparkSQL Load multiple tables - posted by censj <ce...@lotuseed.com> on 2015/12/02 09:06:19 UTC, 5 replies.
- Error when submitting job to yarn cluster - posted by cs user <ac...@gmail.com> on 2015/12/02 11:13:47 UTC, 1 replies.
- Re: [POWERED BY] Please add our organization - posted by Adrien Mogenet <ad...@contentsquare.com> on 2015/12/02 11:53:53 UTC, 2 replies.
- Sharing object/state accross transformations - posted by JayKay <ju...@gmail.com> on 2015/12/02 11:54:40 UTC, 4 replies.
- Jupyter configuration - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/02 13:11:14 UTC, 1 replies.
- create DataFrame from RDD - posted by Zsolt Tóth <to...@gmail.com> on 2015/12/02 15:29:49 UTC, 0 replies.
- default parallelism and mesos executors - posted by Adrian Bridgett <ad...@opensignal.com> on 2015/12/02 15:58:03 UTC, 3 replies.
- Re: Spark Streaming and JMS - posted by SamyaMaiti <sa...@gmail.com> on 2015/12/02 17:36:38 UTC, 0 replies.
- Spark Streaming Use Cases - posted by Priya Ch <le...@gmail.com> on 2015/12/02 19:17:50 UTC, 0 replies.
- Re: possible bug spark/python/pyspark/rdd.py portable_hash() - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/02 19:32:20 UTC, 0 replies.
- Spark Submit statement execution via shell script within cluster or SSH to cluster node - posted by Db-Blog <mp...@gmail.com> on 2015/12/02 20:29:02 UTC, 0 replies.
- Strategy for large amount of small tasks - posted by Sa...@wellsfargo.com on 2015/12/02 21:03:04 UTC, 0 replies.
- Re: Kafka - streaming from multiple topics - posted by dutrow <da...@gmail.com> on 2015/12/02 21:08:05 UTC, 16 replies.
- ML - LinearRegression: is this a bug ???? - posted by Sa...@wellsfargo.com on 2015/12/02 21:50:57 UTC, 1 replies.
- SparkStreaming: Updating internal variables without catching events from Kafka - posted by AliGouta <al...@gmail.com> on 2015/12/02 22:22:09 UTC, 0 replies.
- RE: starting spark-shell throws /tmp/hive on HDFS should be writable error - posted by "Lin, Hao" <Ha...@finra.org> on 2015/12/02 22:48:55 UTC, 2 replies.
- Re: df.partitionBy().parquet() java.lang.OutOfMemoryError: GC overhead limit exceeded - posted by Don Drake <do...@gmail.com> on 2015/12/02 23:11:16 UTC, 3 replies.
- Spark Streaming 1.6 accumulating state across batches for joins - posted by Aris <ar...@gmail.com> on 2015/12/02 23:12:41 UTC, 1 replies.
- newbie how to upgrade a spark-ec2 cluster? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/02 23:39:48 UTC, 5 replies.
- Improve saveAsTextFile performance - posted by Ram VISWANADHA <ra...@dailymotion.com> on 2015/12/03 00:15:41 UTC, 8 replies.
- Spark Streaming from S3 - posted by Michele Freschi <mf...@palantir.com> on 2015/12/03 01:42:38 UTC, 3 replies.
- Pyspark submitted app just hangs - posted by Darren Govoni <da...@ontrenet.com> on 2015/12/03 02:48:41 UTC, 3 replies.
- how to skip headers when reading multiple files - posted by Divya Gehlot <di...@gmail.com> on 2015/12/03 03:52:24 UTC, 2 replies.
- How the cores are used in Directstream approach - posted by Charan Ganga Phani Adabala <ch...@eiqnetworks.com> on 2015/12/03 05:12:31 UTC, 0 replies.
- LDA topic modeling and Spark - posted by "Nguyen, Tiffany T" <ng...@grinnell.edu> on 2015/12/03 06:07:45 UTC, 1 replies.
- spark sql cli query results written to file ? - posted by "fightfate@163.com" <fi...@163.com> on 2015/12/03 07:05:59 UTC, 5 replies.
- Re: send this email to unsubscribe - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/03 07:13:10 UTC, 0 replies.
- Re: Multiplication on decimals in a dataframe query - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/03 07:32:25 UTC, 4 replies.
- Re: Debug Spark - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/03 07:46:57 UTC, 2 replies.
- Re: Error in block pushing thread puts the KinesisReceiver in a stuck state - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/03 07:51:12 UTC, 0 replies.
- How to test https://issues.apache.org/jira/browse/SPARK-10648 fix - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/03 09:39:39 UTC, 3 replies.
- Re: Checkpointing not removing shuffle files from local disk - posted by Ewan Higgs <ew...@ugent.be> on 2015/12/03 11:24:54 UTC, 0 replies.
- Building spark 1.3 from source code to work with Hive 1.2.1 - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/03 11:28:55 UTC, 2 replies.
- spark1.4.1 extremely slow for take(1) or head() or first() or show - posted by hxw黄祥为 <hu...@Ctrip.com> on 2015/12/03 11:29:04 UTC, 1 replies.
- Re: spark1.4.1 extremely slow for take(1) or head() or first() or show - posted by Sahil Sareen <sa...@gmail.com> on 2015/12/03 11:39:12 UTC, 0 replies.
- Column Aliases are Ignored in callUDF while using struct() - posted by Sachin Aggarwal <di...@gmail.com> on 2015/12/03 11:43:25 UTC, 2 replies.
- How and where to update release notes for spark rel 1.6? - posted by RaviShankar KS <em...@gmail.com> on 2015/12/03 12:01:28 UTC, 2 replies.
- Python API Documentation Mismatch - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/03 12:03:35 UTC, 3 replies.
- Why does Spark job stucks and waits for only last tasks to get finished - posted by unk1102 <um...@gmail.com> on 2015/12/03 15:16:07 UTC, 0 replies.
- Sparse Vector ArrayIndexOutOfBoundsException - posted by nabegh <na...@gmail.com> on 2015/12/03 15:30:07 UTC, 1 replies.
- Re: How the cores are used in Directstream approach - posted by Cody Koeninger <co...@koeninger.org> on 2015/12/03 16:37:59 UTC, 1 replies.
- understanding and disambiguating CPU-core related properties - posted by Manolis Sifalakis1 <EM...@zurich.ibm.com> on 2015/12/03 16:44:17 UTC, 2 replies.
- Spark Streaming BackPressure and Custom Receivers - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/03 17:04:55 UTC, 0 replies.
- Local mode: Stages hang for minutes - posted by Richard Marscher <rm...@localytics.com> on 2015/12/03 17:24:31 UTC, 3 replies.
- AWS CLI --jars comma problem - posted by Yusuf Can Gürkan <yu...@useinsider.com> on 2015/12/03 17:51:42 UTC, 1 replies.
- Problem with RDD of (Long, Byte[Array]) - posted by Hervé Yviquel <el...@gmail.com> on 2015/12/03 17:58:47 UTC, 2 replies.
- Creating a dataframe with decimals changes the precision and scale - posted by Philip Dodds <ph...@gmail.com> on 2015/12/03 18:46:52 UTC, 1 replies.
- Any clue on this error, Exception in thread "main" java.lang.NoSuchFieldError: SPARK_RPC_CLIENT_CONNECT_TIMEOUT - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/03 18:54:08 UTC, 7 replies.
- how to spark streaming application start working on next batch before completing on previous batch . - posted by prateek arora <pr...@gmail.com> on 2015/12/03 19:19:00 UTC, 2 replies.
- Spark Streaming Running Out Of Memory in 1.5.0. - posted by Augustus Hong <au...@branchmetrics.io> on 2015/12/03 20:47:45 UTC, 1 replies.
- Does Spark streaming support iterative operator? - posted by Wang Yangjun <ya...@aalto.fi> on 2015/12/03 21:04:49 UTC, 2 replies.
- Spark java.lang.SecurityException: class “javax.servlet.FilterRegistration”' with sbt - posted by Moises Baly <mo...@spatially.com> on 2015/12/03 21:46:21 UTC, 0 replies.
- SparkStreaming is failing to process Kafka jobs under load.... - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/12/03 22:42:28 UTC, 1 replies.
- Spark SQL - Reading HCatalog Table - posted by Sandip Mehta <sa...@gmail.com> on 2015/12/03 22:50:39 UTC, 0 replies.
- newbie best practices: is spark-ec2 intended to be used to manage long-lasting infrastructure ? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/03 23:05:11 UTC, 4 replies.
- SparkR in Spark 1.5.2 jsonFile Bug Found - posted by tomasr3 <to...@transvoyant.com> on 2015/12/03 23:07:55 UTC, 2 replies.
- consumergroup not working - posted by Hudong Wang <ju...@hotmail.com> on 2015/12/03 23:41:06 UTC, 0 replies.
- sparkavro for PySpark 1.3 - posted by YaoPau <jo...@gmail.com> on 2015/12/04 00:28:55 UTC, 2 replies.
- jdbc error, ClassNotFoundException: org.apache.hadoop.hive.schshim.FairSchedulerShim - posted by zhangjp <59...@qq.com> on 2015/12/04 02:59:55 UTC, 0 replies.
- spark master - run-tests error - posted by "wei.zhu@kaiyuandao.com" <we...@kaiyuandao.com> on 2015/12/04 03:09:05 UTC, 4 replies.
- Spark Streaming Shuffle to Disk - posted by Steven Pearson <sp...@gmail.com> on 2015/12/04 03:54:53 UTC, 3 replies.
- One Problem about Spark Dynamic Allocation - posted by 谢廷稳 <xi...@gmail.com> on 2015/12/04 06:55:37 UTC, 0 replies.
- Not able to receive data in spark from rsyslog - posted by masoom alam <ma...@wanclouds.net> on 2015/12/04 08:30:21 UTC, 1 replies.
- Predictive Modeling - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/12/04 10:32:34 UTC, 1 replies.
- Avoid Shuffling on Partitioned Data - posted by Yiannis Gkoufas <jo...@gmail.com> on 2015/12/04 10:53:35 UTC, 3 replies.
- Getting error when trying to start master node after building spark 1.3 - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/04 11:42:10 UTC, 2 replies.
- Questions about Spark Shuffle and Heap - posted by Jianneng Li <ji...@berkeley.edu> on 2015/12/04 13:01:33 UTC, 0 replies.
- Is it possible to pass additional parameters to a python function when used inside RDD.filter method? - posted by Abhishek Shivkumar <ab...@bigdatapartnership.com> on 2015/12/04 13:19:43 UTC, 2 replies.
- (Unknown) - posted by Sateesh Karuturi <sa...@gmail.com> on 2015/12/04 13:27:51 UTC, 0 replies.
- Spark applications metrics - posted by patcharee <Pa...@uni.no> on 2015/12/04 13:50:44 UTC, 1 replies.
- How to get the list of available Transformations and actions for a RDD in Spark-Shell - posted by Gokula Krishnan D <em...@gmail.com> on 2015/12/04 14:01:58 UTC, 4 replies.
- RDD functions - posted by Sateesh Karuturi <sa...@gmail.com> on 2015/12/04 14:21:58 UTC, 2 replies.
- Spark UI - Streaming Tab - posted by patcharee <Pa...@uni.no> on 2015/12/04 15:28:36 UTC, 3 replies.
- anyone who can help me out with thi error please - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/04 16:11:39 UTC, 0 replies.
- has someone seen this error please? - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/04 16:11:39 UTC, 0 replies.
- Fwd: Can't run Spark Streaming Kinesis example - posted by Brian London <br...@gmail.com> on 2015/12/04 18:10:48 UTC, 1 replies.
- How to access a RDD (that has been broadcasted) inside the filter method of another RDD? - posted by Abhishek Shivkumar <ab...@bigdatapartnership.com> on 2015/12/04 18:13:30 UTC, 2 replies.
- Regarding Join between two graphs - posted by hastimal <ja...@gmail.com> on 2015/12/04 18:28:13 UTC, 0 replies.
- is Multiple Spark Contexts is supported in spark 1.5.0 ? - posted by prateek arora <pr...@gmail.com> on 2015/12/04 18:46:04 UTC, 11 replies.
- ROW_TIMESTAMP support with UNSIGNED_LONG - posted by pierre lacave <pi...@lacave.me> on 2015/12/04 18:51:41 UTC, 1 replies.
- Any role for volunteering - posted by Deepak Sharma <de...@gmail.com> on 2015/12/04 18:58:05 UTC, 0 replies.
- Spark SQL IN Clause - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/04 19:14:52 UTC, 4 replies.
- Is Temporary Access Credential (AccessKeyId, SecretAccessKey + SecurityToken) support by Spark? - posted by "Lin, Hao" <Ha...@finra.org> on 2015/12/04 19:41:03 UTC, 2 replies.
- Oozie SparkAction not able to use spark conf values - posted by Rajadayalan Perumalsamy <ra...@gmail.com> on 2015/12/04 19:43:35 UTC, 1 replies.
- Not all workers seem to run in a standalone cluster setup by spark-ec2 script - posted by Kyohey Hamaguchi <tn...@gmail.com> on 2015/12/04 20:28:23 UTC, 4 replies.
- How to modularize Spark Streaming Jobs? - posted by SRK <sw...@gmail.com> on 2015/12/04 20:38:24 UTC, 0 replies.
- Re: JMXSink for YARN deployment - posted by spearson23 <sp...@gmail.com> on 2015/12/04 20:55:26 UTC, 1 replies.
- Higher Processing times in Spark Streaming with kafka Direct - posted by SRK <sw...@gmail.com> on 2015/12/04 22:21:06 UTC, 1 replies.
- RE: Broadcasting a parquet file using spark and python - posted by Shuai Zheng <sz...@gmail.com> on 2015/12/04 23:49:34 UTC, 2 replies.
- Exception in thread "main" java.lang.IncompatibleClassChangeError: - posted by Prem Sure <pr...@gmail.com> on 2015/12/04 23:52:36 UTC, 1 replies.
- Spark ML Random Forest output. - posted by Eugene Morozov <ev...@gmail.com> on 2015/12/04 23:56:47 UTC, 5 replies.
- spark.authenticate=true YARN mode doesn't work - posted by prasadreddy <al...@gmail.com> on 2015/12/05 02:47:39 UTC, 7 replies.
- the way to compare any two adjacent elements in one rdd - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/05 07:30:56 UTC, 6 replies.
- MLlib training time question - posted by Haoyue Wang <wh...@gmail.com> on 2015/12/05 08:14:44 UTC, 2 replies.
- 8080 not working - posted by Chintan Bhatt <ch...@charusat.ac.in> on 2015/12/05 10:14:32 UTC, 1 replies.
- Dataset and lambas - posted by Koert Kuipers <ko...@tresata.com> on 2015/12/05 18:42:30 UTC, 7 replies.
- Testing with spark testing base - posted by Masf <ma...@gmail.com> on 2015/12/05 18:51:04 UTC, 1 replies.
- ClassCastException in Kryo Serialization - posted by SRK <sw...@gmail.com> on 2015/12/06 02:17:00 UTC, 0 replies.
- How to identify total schedule delay in a Streaming app using Ganglia? - posted by SRK <sw...@gmail.com> on 2015/12/06 02:20:33 UTC, 0 replies.
- How to debug Spark source using IntelliJ/ Eclipse - posted by jatinganhotra <ja...@gmail.com> on 2015/12/06 03:58:22 UTC, 0 replies.
- Possible bug in Spark 1.5.0 onwards while loading Postgres JDBC driver - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 06:37:12 UTC, 1 replies.
- Re: maven built the spark-1.5.2 source documents, but error happened - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 06:45:43 UTC, 0 replies.
- Re: how to judge a DStream is empty or not after filter operation, so that return a boolean reault - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:13:57 UTC, 0 replies.
- spark streaming in python: questions about countByValue and countByValueAndWindow - posted by krist333 <kr...@gmail.com> on 2015/12/06 07:17:01 UTC, 0 replies.
- Re: Spark Streaming - controlling Cached table registered in memory created from each RDD of a windowed stream - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:23:59 UTC, 0 replies.
- Re: Spark SQL doesn't support column names that contain '-','$'... - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:39:11 UTC, 0 replies.
- Re: Spark 1.5.2 getting stuck when reading from HDFS in YARN client mode - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:43:26 UTC, 0 replies.
- Parquet runs out of memory when reading in a huge matrix - posted by AlexG <sw...@gmail.com> on 2015/12/06 07:47:15 UTC, 0 replies.
- Re: Obtaining Job Id for query submitted via Spark Thrift Server - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:48:26 UTC, 2 replies.
- Re: partition RDD of images - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 07:52:44 UTC, 0 replies.
- Re: Experiences about NoSQL databases with Spark - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 08:07:31 UTC, 2 replies.
- Re: tmp directory - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 08:11:55 UTC, 0 replies.
- Re: Spark checkpointing - restrict checkpointing to local file system? - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 08:17:51 UTC, 0 replies.
- Re: [streaming] KafkaUtils.createDirectStream - how to start streming from checkpoints? - posted by manasdebashiskar <po...@gmail.com> on 2015/12/06 08:25:53 UTC, 1 replies.
- Spark SQL 1.3 not finding attribute in DF - posted by YaoPau <jo...@gmail.com> on 2015/12/06 09:15:04 UTC, 3 replies.
- Re: Spark on YARN: java.lang.ClassCastException SerializedLambda to org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1 - posted by Mohamed Nadjib Mami <ma...@iai.uni-bonn.de> on 2015/12/06 11:29:00 UTC, 0 replies.
- No support to save DataFrame in existing database table using DataFrameWriter.jdbc() - posted by unk1102 <um...@gmail.com> on 2015/12/06 11:54:10 UTC, 1 replies.
- Spark sql data frames do they run in parallel by default? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/06 14:55:06 UTC, 1 replies.
- Spark GraphX default Storage Level - posted by prasad223 <pr...@gmail.com> on 2015/12/06 15:50:07 UTC, 0 replies.
- Implementing fail-fast upon critical spark streaming tasks errors - posted by yam <yo...@playtech.com> on 2015/12/06 16:11:00 UTC, 1 replies.
- fail-fast or retry failed spark streaming jobs - posted by yam <yo...@playtech.com> on 2015/12/06 17:04:31 UTC, 0 replies.
- Support for custom serializers in Checkpoint - posted by "Sela, Amit" <AN...@paypal.com.INVALID> on 2015/12/06 17:11:08 UTC, 0 replies.
- State management in spark-streaming - posted by Hafiz Mujadid <ha...@gmail.com> on 2015/12/06 18:20:38 UTC, 2 replies.
- spark-shell launch not clean - posted by Navdeep Kainth <na...@hotmail.com> on 2015/12/06 20:44:30 UTC, 0 replies.
- mllib.recommendations.als recommendForAll not ported to ml? - posted by guillaume <gu...@schibsted.com> on 2015/12/06 20:59:31 UTC, 1 replies.
- PySpark RDD with NumpyArray Structure - posted by Mustafa Elbehery <el...@gmail.com> on 2015/12/06 21:59:57 UTC, 1 replies.
- String Manipulation/Agregation - posted by Shige Song <sh...@gmail.com> on 2015/12/07 04:06:06 UTC, 0 replies.
- Could not load shims in class org.apache.hadoop.hive.schshim.FairSchedulerShim - posted by zhangjp <59...@qq.com> on 2015/12/07 04:35:52 UTC, 1 replies.
- how often you use Tachyon to accelerate Spark - posted by Arvin <ar...@gmail.com> on 2015/12/07 04:52:27 UTC, 0 replies.
- Re: parquet file doubts - posted by Cheng Lian <li...@databricks.com> on 2015/12/07 05:01:32 UTC, 5 replies.
- Task Time is too high in a single executor in Streaming - posted by SRK <sw...@gmail.com> on 2015/12/07 05:42:06 UTC, 0 replies.
- Intersection of two sets by key - join vs filter + join - posted by Z Z <zo...@gmail.com> on 2015/12/07 06:41:20 UTC, 2 replies.
- Find all simple paths of a maximum specific length between two nodes of a graph. - posted by kauarba <ka...@gmail.com> on 2015/12/07 07:13:47 UTC, 0 replies.
- Inconsistent data in Cassandra - posted by Priya Ch <le...@gmail.com> on 2015/12/07 07:17:33 UTC, 1 replies.
- How to get the list of running applications and Cores/Memory in use? - posted by Haopu Wang <HW...@qilinsoft.com> on 2015/12/07 07:56:00 UTC, 0 replies.
- Task hung on SocketInputStream.socketRead0 when reading large a mount of data from AWS S3 - posted by Sa Xiao <sa...@gmail.com> on 2015/12/07 09:10:21 UTC, 0 replies.
- How to config the log in Spark - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/12/07 09:38:02 UTC, 0 replies.
- Available options for Spark REST API - posted by sunil m <26...@gmail.com> on 2015/12/07 09:42:23 UTC, 1 replies.
- persist spark output in hive using DataFrame and saveAsTable API - posted by Divya Gehlot <di...@gmail.com> on 2015/12/07 10:28:59 UTC, 4 replies.
- How to use all available memory per worker? - posted by George Sigletos <si...@textkernel.nl> on 2015/12/07 12:08:32 UTC, 0 replies.
- how create hbase connect? - posted by censj <ce...@lotuseed.com> on 2015/12/07 12:56:47 UTC, 2 replies.
- Spark and Kafka Integration - posted by Prashant Bhardwaj <pr...@gmail.com> on 2015/12/07 13:16:25 UTC, 1 replies.
- Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel - posted by Ewan Higgs <ew...@ugent.be> on 2015/12/07 14:42:52 UTC, 2 replies.
- Obtaining metrics of an individual Spark job - posted by diplomatic Guru <di...@gmail.com> on 2015/12/07 15:01:35 UTC, 0 replies.
- [SPARK] Obtaining matrices of an individual Spark job - posted by diplomatic Guru <di...@gmail.com> on 2015/12/07 15:06:28 UTC, 0 replies.
- spark sql current time stamp function ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/07 15:47:28 UTC, 8 replies.
- Re: In yarn-client mode, is it the driver or application master that issue commands to executors? - posted by Nisrina Luthfiyati <ni...@gmail.com> on 2015/12/07 16:01:38 UTC, 2 replies.
- Re: Where to implement synchronization is GraphX Pregel API - posted by Robineast <Ro...@xense.co.uk> on 2015/12/07 16:33:25 UTC, 0 replies.
- Spark sql random number or sequence numbers ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/07 16:40:29 UTC, 1 replies.
- FW: Managed to make Hive run on Spark engine - posted by Mich Talebzadeh <mi...@peridale.co.uk> on 2015/12/07 16:50:01 UTC, 1 replies.
- How to change StreamingContext batch duration after loading from checkpoint - posted by yam <yo...@playtech.com> on 2015/12/07 17:23:46 UTC, 2 replies.
- How to create dataframe from SQL Server SQL query - posted by "Wang, Ningjun (LNG-NPV)" <ni...@lexisnexis.com> on 2015/12/07 17:26:13 UTC, 3 replies.
- Re: Shared memory between C++ process and Spark - posted by Robin East <ro...@xense.co.uk> on 2015/12/07 17:54:07 UTC, 15 replies.
- Removing duplicates from dataframe - posted by Ro...@thomsonreuters.com on 2015/12/07 19:12:11 UTC, 4 replies.
- How to build Spark with Ganglia to enable monitoring using Ganglia - posted by SRK <sw...@gmail.com> on 2015/12/07 19:13:03 UTC, 1 replies.
- SparkSQL AVRO - posted by Test One <t1...@cksworks.com> on 2015/12/07 19:27:58 UTC, 2 replies.
- Spark on hbase using Phoenix in secure cluster - posted by Akhilesh Pathodia <pa...@gmail.com> on 2015/12/07 20:54:44 UTC, 3 replies.
- Example of a Trivial Custom PySpark Transformer - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/07 21:54:15 UTC, 0 replies.
- issue creating pyspark Transformer UDF that creates a LabeledPoint: AttributeError: 'DataFrame' object has no attribute '_get_object_id' - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/07 23:13:01 UTC, 0 replies.
- Local Mode: Executor thread leak? - posted by Richard Marscher <rm...@localytics.com> on 2015/12/07 23:30:07 UTC, 4 replies.
- Spark SQL - saving to multiple partitions in parallel - FileNotFoundException on _temporary directory possible bug? - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/08 00:16:30 UTC, 1 replies.
- python rdd.partionBy(): any examples of a custom partitioner? - posted by Keith Freeman <8f...@gmail.com> on 2015/12/08 03:07:40 UTC, 1 replies.
- Best way to save key-value pair rdd ? - posted by Anup Sawant <an...@gmail.com> on 2015/12/08 03:58:50 UTC, 0 replies.
- Kryo Serialization in Spark - posted by prasad223 <pr...@gmail.com> on 2015/12/08 04:01:33 UTC, 1 replies.
- Unable to acces hive table (created through hive context) in hive console - posted by Divya Gehlot <di...@gmail.com> on 2015/12/08 07:12:28 UTC, 1 replies.
- NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.enable - posted by Sunil Tripathy <su...@outlook.com> on 2015/12/08 07:26:41 UTC, 2 replies.
- HiveContext creation failed with Kerberos - posted by Neal Yin <ne...@workday.com> on 2015/12/08 07:52:09 UTC, 2 replies.
- Spark with MapDB - posted by Ramkumar V <ra...@gmail.com> on 2015/12/08 08:52:27 UTC, 6 replies.
- what's the way to access the last element from another partition - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/08 10:23:16 UTC, 0 replies.
- bad performance on PySpark - big text file - posted by patcharee <Pa...@uni.no> on 2015/12/08 10:26:57 UTC, 0 replies.
- flatMap function in Spark - posted by Sateesh Karuturi <sa...@gmail.com> on 2015/12/08 12:04:14 UTC, 1 replies.
- Logging spark output to hdfs file - posted by sunil m <26...@gmail.com> on 2015/12/08 13:00:10 UTC, 1 replies.
- Need to maintain the consumer offset by myself when using spark streaming kafka direct approach? - posted by Tao Li <li...@gmail.com> on 2015/12/08 13:05:35 UTC, 3 replies.
- PySpark reading from Postgres tables with UUIDs - posted by Chris Elsmore <ch...@demandlogic.co.uk> on 2015/12/08 13:15:44 UTC, 0 replies.
- Comparisons between Ganglia and Graphite for monitoring the Streaming Cluster? - posted by SRK <sw...@gmail.com> on 2015/12/08 13:17:00 UTC, 1 replies.
- Re: Can not see any spark metrics on ganglia-web - posted by SRK <sw...@gmail.com> on 2015/12/08 13:36:40 UTC, 2 replies.
- is repartition very cost - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/08 14:04:53 UTC, 3 replies.
- actors and async communication between driver and workers/executors - posted by Manolis Sifalakis1 <EM...@zurich.ibm.com> on 2015/12/08 15:06:59 UTC, 0 replies.
- Re: hive thriftserver and fair scheduling - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/08 15:15:33 UTC, 0 replies.
- epoch date time problem to load data into in spark - posted by Soni spark <so...@gmail.com> on 2015/12/08 15:20:50 UTC, 0 replies.
- groupByKey() - posted by Yasemin Kaya <go...@gmail.com> on 2015/12/08 15:31:35 UTC, 0 replies.
- Graph visualization tool for GraphX - posted by "Lin, Hao" <Ha...@finra.org> on 2015/12/08 16:46:36 UTC, 5 replies.
- Re: epoch date format to normal date format while loading the files to HDFS - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/08 17:13:38 UTC, 0 replies.
- SparkR read.df failed to read file from local directory - posted by Boyu Zhang <bo...@gmail.com> on 2015/12/08 17:47:42 UTC, 3 replies.
- Associating spark jobs with logs - posted by sunil m <26...@gmail.com> on 2015/12/08 17:57:35 UTC, 2 replies.
- Re: Can't create UDF's in spark 1.5 while running using the hive thrift service - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/08 17:58:54 UTC, 1 replies.
- can i write only RDD transformation into hdfs or any other storage system - posted by prateek arora <pr...@gmail.com> on 2015/12/08 18:40:37 UTC, 2 replies.
- Merge rows into csv - posted by Krishna <re...@gmail.com> on 2015/12/08 18:47:42 UTC, 1 replies.
- Spark metrics not working - posted by Jesse F Chen <jf...@us.ibm.com> on 2015/12/08 20:45:34 UTC, 0 replies.
- INotifyDStream - where to find it? - posted by octagon blue <oc...@fastmail.com> on 2015/12/08 22:25:03 UTC, 1 replies.
- Re: Exception in Spark-sql insertIntoJDBC command - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/08 23:04:43 UTC, 0 replies.
- spark-defaults.conf optimal configuration - posted by cjrumble <cj...@gmail.com> on 2015/12/08 23:22:27 UTC, 2 replies.
- Re: Spark metrics for ganglia - posted by swetha kasireddy <sw...@gmail.com> on 2015/12/09 04:06:27 UTC, 0 replies.
- How to get custom metrics using Ganglia Sink? - posted by SRK <sw...@gmail.com> on 2015/12/09 04:15:28 UTC, 0 replies.
- Re: Spark Java.lang.NullPointerException - posted by michael_han <mi...@hotmail.com> on 2015/12/09 04:18:01 UTC, 2 replies.
- set up spark 1.4.1 as default spark engine in HDP 2.2/2.3 - posted by Divya Gehlot <di...@gmail.com> on 2015/12/09 04:28:12 UTC, 1 replies.
- RE: Executor metrics in spark application - posted by SRK <sw...@gmail.com> on 2015/12/09 06:36:52 UTC, 0 replies.
- How to use collections inside foreach block - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/09 06:49:40 UTC, 3 replies.
- About Spark On Hbase - posted by censj <ce...@lotuseed.com> on 2015/12/09 08:04:38 UTC, 10 replies.
- Re: Re: About Spark On Hbase - posted by "fightfate@163.com" <fi...@163.com> on 2015/12/09 08:42:32 UTC, 1 replies.
- Differences between Spark APIs for Hadoop 1.x and Hadoop 2.x in terms of performance, progress reporting and IO metrics. - posted by Hyukjin Kwon <gu...@gmail.com> on 2015/12/09 10:01:12 UTC, 2 replies.
- getting error while persisting in hive - posted by Divya Gehlot <di...@gmail.com> on 2015/12/09 10:37:01 UTC, 1 replies.
- Filtering records based on empty value of column in SparkSql - posted by Prashant Bhardwaj <pr...@gmail.com> on 2015/12/09 12:43:45 UTC, 7 replies.
- HiveContext.read.orc - buffer size not respected after setting it - posted by Fabian Böhnlein <fa...@gmail.com> on 2015/12/09 12:52:57 UTC, 0 replies.
- ALS with repeated entries - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/09 14:07:30 UTC, 0 replies.
- spark data frame write.mode("append") bug - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/09 14:54:49 UTC, 4 replies.
- How to interpret executorRunTime? - posted by "Saraswat, Sandeep" <sa...@hpe.com> on 2015/12/09 14:57:43 UTC, 0 replies.
- Spark Stream Monitoring with Kafka Direct API - posted by Dan Dutrow <da...@gmail.com> on 2015/12/09 16:39:36 UTC, 4 replies.
- RDD.isEmpty - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/12/09 17:41:14 UTC, 8 replies.
- Unsubsribe - posted by Michael Nolting <mi...@sevenval.com> on 2015/12/09 17:48:43 UTC, 1 replies.
- Content based window operation on Time-series data - posted by Arun Verma <ar...@gmail.com> on 2015/12/09 17:54:15 UTC, 4 replies.
- [Spark-1.5.2][Hadoop-2.6][Spark SQL] Cannot run queries in SQLContext, getting java.lang.NoSuchMethodError - posted by Matheus Ramos <ma...@gmail.com> on 2015/12/09 19:19:56 UTC, 3 replies.
- Mesos scheduler obeying limit of tasks / executor - posted by Charles Allen <ch...@metamarkets.com> on 2015/12/09 19:23:07 UTC, 2 replies.
- SparkStreaming variable scope - posted by jpinela <pi...@gmail.com> on 2015/12/09 19:23:25 UTC, 4 replies.
- RegressionModelEvaluator (from jpmml) NotSerializableException when instantiated in the driver - posted by Utkarsh Sengar <ut...@gmail.com> on 2015/12/09 20:01:25 UTC, 1 replies.
- can i process multiple batch in parallel in spark streaming - posted by prateek arora <pr...@gmail.com> on 2015/12/09 20:12:02 UTC, 3 replies.
- Recursive nested wildcard directory walking in Spark - posted by James Ding <jd...@palantir.com> on 2015/12/09 20:18:04 UTC, 2 replies.
- Multiple drivers, same worker - posted by "andresbm84@gmail.com" <an...@gmail.com> on 2015/12/09 20:33:55 UTC, 5 replies.
- SparkML. RandomForest predict performance for small dataset. - posted by Eugene Morozov <ev...@gmail.com> on 2015/12/09 21:37:58 UTC, 1 replies.
- Release data for spark 1.6? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/09 22:20:40 UTC, 6 replies.
- HTTP Source for Spark Streaming - posted by Sourav Mazumder <so...@gmail.com> on 2015/12/09 22:27:35 UTC, 2 replies.
- Cause of akka.pattern.AskTimeoutException - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/12/09 22:45:00 UTC, 1 replies.
- Is Spark History Server supported for Mesos? - posted by Kelvin Chu <2d...@gmail.com> on 2015/12/09 23:01:18 UTC, 1 replies.
- Re: Saving RDDs in Tachyon - posted by Calvin Jia <ji...@gmail.com> on 2015/12/09 23:27:22 UTC, 0 replies.
- Re: spark shared RDD - posted by Calvin Jia <ji...@gmail.com> on 2015/12/09 23:45:00 UTC, 0 replies.
- Re: Re: Spark RDD cache persistence - posted by Calvin Jia <ji...@gmail.com> on 2015/12/09 23:51:13 UTC, 0 replies.
- StackOverflowError when writing dataframe to table - posted by "apu mishra . rr" <ap...@gmail.com> on 2015/12/09 23:59:04 UTC, 1 replies.
- distcp suddenly broken with spark-ec2 script setup - posted by AlexG <sw...@gmail.com> on 2015/12/10 01:24:50 UTC, 1 replies.
- count distinct in spark sql aggregation - posted by "fightfate@163.com" <fi...@163.com> on 2015/12/10 02:36:08 UTC, 0 replies.
- how to reference aggregate columns - posted by skaarthik oss <sk...@gmail.com> on 2015/12/10 03:05:00 UTC, 1 replies.
- IP error on starting spark-shell on windows 7 - posted by Stefan Karos <to...@gmail.com> on 2015/12/10 05:19:28 UTC, 1 replies.
- Spark 1.5.2 error on quitting spark in windows 7 - posted by skypickle <to...@gmail.com> on 2015/12/10 05:54:50 UTC, 0 replies.
- sortByKey not spilling to disk? (PySpark 1.3) - posted by YaoPau <jo...@gmail.com> on 2015/12/10 06:28:27 UTC, 0 replies.
- Schedular delay in spark 1.4 - posted by Renu Yadav <yr...@gmail.com> on 2015/12/10 06:29:32 UTC, 0 replies.
- GLM in apache spark in MLlib - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/10 07:54:52 UTC, 1 replies.
- hbase Put object kryo serialisation error - posted by Shushant Arora <sh...@gmail.com> on 2015/12/10 07:59:26 UTC, 0 replies.
- Help: Get Timeout error and FileNotFoundException when shuffling large files - posted by kendal <ke...@163.com> on 2015/12/10 10:37:44 UTC, 2 replies.
- Re: About the bottleneck of parquet file reading in Spark - posted by Cheng Lian <li...@databricks.com> on 2015/12/10 10:38:22 UTC, 0 replies.
- RE: FileNotFoundException in appcache shuffle files - posted by kendal <ke...@163.com> on 2015/12/10 10:47:45 UTC, 1 replies.
- Re: Spark groupby and agg inconsistent and missing data - posted by Kapil Raaj <ca...@gmail.com> on 2015/12/10 10:49:29 UTC, 0 replies.
- Apache spark Web UI on Amazon EMR not working - posted by sonal sharma <so...@gmail.com> on 2015/12/10 11:40:04 UTC, 0 replies.
- Can't filter - posted by Бобров Виктор <ma...@bk.ru> on 2015/12/10 12:10:36 UTC, 7 replies.
- Inverse of the matrix - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/10 12:33:24 UTC, 1 replies.
- example of querying LDA model - posted by Olga Syrova <Ol...@stepstone.de> on 2015/12/10 12:40:08 UTC, 0 replies.
- Delays between Stage Submit and First Task Submit - posted by Matthias Niehoff <ma...@codecentric.de> on 2015/12/10 12:52:08 UTC, 1 replies.
- Error Handling approach for SparkSQL queries in Spark version 1.4 - posted by satish chandra j <js...@gmail.com> on 2015/12/10 13:46:13 UTC, 0 replies.
- Re: HELP! I get "java.lang.String cannot be cast to java.lang.Intege " for a long time. - posted by Bonsen <he...@126.com> on 2015/12/10 15:38:33 UTC, 6 replies.
- Spark on EMR: out-of-the-box solution for real-time application logs monitoring? - posted by Roberto Coluccio <ro...@gmail.com> on 2015/12/10 15:52:53 UTC, 2 replies.
- Spark Streaming Kinesis - DynamoDB Streams compatability - posted by Nick Pentreath <ni...@gmail.com> on 2015/12/10 16:04:27 UTC, 0 replies.
- Spark streaming with Kinesis broken? - posted by Brian London <br...@gmail.com> on 2015/12/10 16:50:27 UTC, 12 replies.
- Spark job submission REST API - posted by mvle <mv...@us.ibm.com> on 2015/12/10 17:26:35 UTC, 4 replies.
- Warning: Master endpoint spark://ip:7077 was not a REST server. Falling back to legacy submission gateway instead. - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/10 17:52:56 UTC, 3 replies.
- Re: Rule Engine for Spark - posted by Luciano Resende <lu...@gmail.com> on 2015/12/10 17:53:52 UTC, 0 replies.
- How to make this Spark 1.5.2 code fast and shuffle less data - posted by unk1102 <um...@gmail.com> on 2015/12/10 18:57:10 UTC, 4 replies.
- Workflow manager for Spark and Spark SQL - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/12/10 19:50:36 UTC, 1 replies.
- Structured Vector Format - posted by Hayri Volkan Agun <vo...@gmail.com> on 2015/12/10 19:53:39 UTC, 0 replies.
- Spark 1.3.1 - Does SparkConext in multi-threaded env requires SparkEnv.set(env) anymore - posted by Nirav Patel <np...@xactlycorp.com> on 2015/12/10 20:03:15 UTC, 1 replies.
- [mesos][docker] addFile doesn't work properly - posted by "PHELIPOT, REMY" <re...@atos.net> on 2015/12/10 21:04:26 UTC, 1 replies.
- Replaying an RDD in spark streaming to update an accumulator - posted by AliGouta <al...@gmail.com> on 2015/12/10 21:05:28 UTC, 2 replies.
- memory leak when saving Parquet files in Spark - posted by Matt K <ma...@gmail.com> on 2015/12/10 22:33:39 UTC, 3 replies.
- Re: DataFrame creation delay? - posted by Isabelle Phan <nl...@gmail.com> on 2015/12/10 23:28:49 UTC, 2 replies.
- architecture though experiment: what is the advantage of using kafka with spark streaming? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/11 03:00:10 UTC, 0 replies.
- Re: DataFrame: Compare each row to every other row? - posted by manasdebashiskar <po...@gmail.com> on 2015/12/11 04:29:14 UTC, 0 replies.
- Re: "Address already in use" after many streams on Kafka - posted by manasdebashiskar <po...@gmail.com> on 2015/12/11 04:55:39 UTC, 0 replies.
- Re: Does Spark SQL have to scan all the columns of a table in text format? - posted by manasdebashiskar <po...@gmail.com> on 2015/12/11 04:58:08 UTC, 0 replies.
- Re: How to control number of parquet files generated when using partitionBy - posted by manasdebashiskar <po...@gmail.com> on 2015/12/11 05:06:56 UTC, 0 replies.
- Re: architecture though experiment: what is the advantage of using kafka with spark streaming? - posted by Cody Koeninger <co...@koeninger.org> on 2015/12/11 06:03:44 UTC, 0 replies.
- org.apache.spark.SparkException: Task failed while writing rows.+ Spark output data to hive table - posted by Divya Gehlot <di...@gmail.com> on 2015/12/11 06:53:32 UTC, 0 replies.
- Spark assembly in Maven repo? - posted by Xiaoyong Zhu <xi...@microsoft.com> on 2015/12/11 07:46:12 UTC, 10 replies.
- Using TestHiveContext/HiveContext in unit tests - posted by Sahil Sareen <sa...@gmail.com> on 2015/12/11 11:06:48 UTC, 1 replies.
- spark metrics in graphite missing for some executors - posted by rok <ro...@gmail.com> on 2015/12/11 11:16:03 UTC, 0 replies.
- coGroup problem /spark streaming - posted by "Vieru, Mihail" <mi...@zalando.de> on 2015/12/11 12:07:53 UTC, 0 replies.
- Creation of RDD in foreachAsync is failing - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/11 14:23:42 UTC, 0 replies.
- What is the relationship between reduceByKey and spark.driver.maxResultSize? - posted by Tom Seddon <mr...@gmail.com> on 2015/12/11 15:14:59 UTC, 2 replies.
- spark-submit problems with --packages and --deploy-mode cluster - posted by Greg Hill <gr...@RACKSPACE.COM> on 2015/12/11 16:18:29 UTC, 0 replies.
- Compiling ERROR for Spark MetricsSystem - posted by Haijia Zhou <ha...@adobe.com> on 2015/12/11 17:47:55 UTC, 1 replies.
- Spark Submit - java.lang.IllegalArgumentException: requirement failed - posted by "Afshartous, Nick" <na...@turbine.com> on 2015/12/11 17:49:27 UTC, 0 replies.
- Spark streaming driver java process RSS memory constantly increasing using cassandra driver - posted by Conor Fennell <co...@gmail.com> on 2015/12/11 18:10:39 UTC, 4 replies.
- how to access local file from Spark sc.textFile("file:///path to/myfile") - posted by "Lin, Hao" <Ha...@finra.org> on 2015/12/11 18:19:08 UTC, 10 replies.
- Performance does not increase as the number of workers increasing in cluster mode - posted by Wei Da <xw...@qq.com> on 2015/12/11 18:34:34 UTC, 1 replies.
- Window function in Spark SQL - posted by Sourav Mazumder <so...@gmail.com> on 2015/12/11 18:59:14 UTC, 3 replies.
- Re: Spark Submit - java.lang.IllegalArgumentException: requirement failed - posted by Jean-Baptiste Onofré <jb...@nanthrax.net> on 2015/12/11 19:01:44 UTC, 1 replies.
- cluster mode uses port 6066 Re: Warning: Master endpoint spark://ip:7077 was not a REST server. Falling back to legacy submission gateway instead. - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/11 19:41:51 UTC, 0 replies.
- Multi-core support per task in Spark - posted by Zhan Zhang <zz...@hortonworks.com> on 2015/12/11 19:46:05 UTC, 1 replies.
- HDFS - posted by shahid ashraf <sh...@trialx.com> on 2015/12/11 19:46:54 UTC, 1 replies.
- Questions on Kerberos usage with YARN and JDBC - posted by Mike Wright <mw...@snl.com> on 2015/12/11 19:50:16 UTC, 2 replies.
- UNSUBSCRIBE - posted by wi...@aim.com on 2015/12/11 22:33:44 UTC, 4 replies.
- imposed dynamic resource allocation - posted by Antony Mayi <an...@yahoo.com.INVALID> on 2015/12/11 23:01:38 UTC, 1 replies.
- Spark REST API shows Error 503 Service Unavailable - posted by prateek arora <pr...@gmail.com> on 2015/12/11 23:05:57 UTC, 5 replies.
- Classpath problem trying to use DataFrames - posted by Christopher Brady <ch...@oracle.com> on 2015/12/12 02:01:46 UTC, 2 replies.
- How to display column names in spark-sql output - posted by Ashwin Shankar <as...@gmail.com> on 2015/12/12 02:16:42 UTC, 1 replies.
- Concatenate a string to a Column of type string in DataFrame - posted by satish chandra j <js...@gmail.com> on 2015/12/12 09:01:03 UTC, 4 replies.
- Spark does not clean garbage in blockmgr folders on slaves if long running spark-shell is used - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/12/12 10:14:51 UTC, 0 replies.
- why "cache table a as select * from b" will do shuffle,and create 2 stages. - posted by chenwu <an...@gmail.com> on 2015/12/12 12:30:51 UTC, 2 replies.
- How to use HProf to profile Spark CPU overhead - posted by Jia Zou <ja...@gmail.com> on 2015/12/12 20:36:32 UTC, 2 replies.
- Has the format of a spark jar file changes in 1.5 - posted by Steve Lewis <lo...@gmail.com> on 2015/12/12 21:36:38 UTC, 0 replies.
- could not find driver id for spark application - posted by Jade Liu <ja...@nor1.com> on 2015/12/13 02:08:35 UTC, 2 replies.
- Re: Classpath problem trying to use DataFrames - posted by Ricky <49...@qq.com> on 2015/12/13 08:30:23 UTC, 0 replies.
- How to unpack the values of an item in a RDD so I can create a RDD with multiple items? - posted by Abhishek Shivkumar <ab...@gmail.com> on 2015/12/13 18:40:22 UTC, 1 replies.
- Use of rdd.zipWithUniqueId() in DStream - posted by Sourav Mazumder <so...@gmail.com> on 2015/12/13 19:18:25 UTC, 1 replies.
- How to save Multilayer Perceptron Classifier model. - posted by Vadim Gribanov <gr...@gmail.com> on 2015/12/13 19:31:41 UTC, 2 replies.
- Make Spark Streaming DFrame as SQL table - posted by Karthikeyan Muthukumar <mk...@gmail.com> on 2015/12/13 21:44:56 UTC, 0 replies.
- comment on table - posted by Jung <jb...@naver.com> on 2015/12/14 05:18:28 UTC, 1 replies.
- Graphx Spark Accumulator - posted by prasad223 <pr...@gmail.com> on 2015/12/14 06:28:42 UTC, 0 replies.
- [SparkR] Any reason why saveDF's mode is append by default ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/12/14 08:58:09 UTC, 2 replies.
- Autoscaling of Spark YARN cluster - posted by Mingyu Kim <mk...@palantir.com> on 2015/12/14 09:57:24 UTC, 4 replies.
- manipulate schema inside a repeated column - posted by Samuel <sa...@gmail.com> on 2015/12/14 11:08:16 UTC, 0 replies.
- Kryo serialization fails when using SparkSQL and HiveContext - posted by "Linh M. Tran" <li...@gmail.com> on 2015/12/14 11:17:06 UTC, 1 replies.
- worker:java.lang.ClassNotFoundException: ttt.test$$anonfun$1 - posted by Bonsen <he...@126.com> on 2015/12/14 12:35:28 UTC, 0 replies.
- RuntimeException: Failed to check null bit for primitive int type - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/14 12:40:37 UTC, 1 replies.
- [SparkR] Is rdd in SparkR deprecated ? - posted by Jeff Zhang <zj...@gmail.com> on 2015/12/14 13:26:37 UTC, 2 replies.
- How do I link JavaEsSpark.saveToEs() to a sparkConf? - posted by Spark Enthusiast <sp...@yahoo.in> on 2015/12/14 13:52:31 UTC, 1 replies.
- Unsubscribe - posted by Roman Garcia <ro...@gmail.com> on 2015/12/14 15:06:33 UTC, 0 replies.
- Spark streaming: java.lang.ClassCastException: org.apache.spark.util.SerializableConfiguration ... on restart from checkpoint - posted by alberskib <al...@gmail.com> on 2015/12/14 16:33:53 UTC, 6 replies.
- Run ad-hoc queries at runtime against cached RDDs - posted by Krishna Rao <kr...@gmail.com> on 2015/12/14 17:19:16 UTC, 3 replies.
- How to Make Spark Streaming DStream as SQL table? - posted by MK <mk...@gmail.com> on 2015/12/14 18:41:16 UTC, 0 replies.
- Strange Set of errors - posted by Steve Lewis <lo...@gmail.com> on 2015/12/14 19:24:47 UTC, 0 replies.
- SparkML algos limitations question. - posted by Eugene Morozov <ev...@gmail.com> on 2015/12/14 19:52:22 UTC, 2 replies.
- Adding a UI Servlet Filter - posted by iamknome <ms...@gmail.com> on 2015/12/14 20:34:21 UTC, 1 replies.
- Saving to JDBC - posted by Bob Corsaro <rc...@gmail.com> on 2015/12/14 21:17:12 UTC, 1 replies.
- Discover SparkUI port for spark streaming job running in cluster mode - posted by Ashish Nigam <as...@gmail.com> on 2015/12/14 22:57:49 UTC, 3 replies.
- troubleshooting "Missing an output location for shuffle" - posted by Veljko Skarich <ve...@gmail.com> on 2015/12/14 23:32:54 UTC, 1 replies.
- ALS mllib.recommendation vs ml.recommendation - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/15 00:22:29 UTC, 1 replies.
- Re: Problems w/YARN Spark Streaming app reading from Kafka - posted by Robert Towne <Ro...@WebTrends.com> on 2015/12/15 00:59:57 UTC, 0 replies.
- Spark Streaming having trouble writing checkpoint - posted by Robert Towne <Ro...@WebTrends.com> on 2015/12/15 01:49:19 UTC, 1 replies.
- Mllib Word2Vec vector representations are very high in value - posted by jxieeducation <jx...@gmail.com> on 2015/12/15 01:54:45 UTC, 0 replies.
- Database does not exist: (Spark-SQL ===> Hive) - posted by Gokula Krishnan D <em...@gmail.com> on 2015/12/15 04:14:43 UTC, 2 replies.
- what are the cons/drawbacks of a Spark DataFrames - posted by "email2dgk@gmail.com" <em...@gmail.com> on 2015/12/15 04:35:37 UTC, 3 replies.
- Re: Problem using User Defined Predicate pushdown with core RDD and parquet - UDP class not found - posted by chao chu <ch...@gmail.com> on 2015/12/15 05:24:48 UTC, 0 replies.
- how to make a dataframe of Array[Doubles] ? - posted by AlexG <sw...@gmail.com> on 2015/12/15 06:12:30 UTC, 2 replies.
- Linear Regression with OLS - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/15 06:13:27 UTC, 1 replies.
- Spark parallelism with mapToPair - posted by Rabin Banerjee <ra...@ericsson.com> on 2015/12/15 07:44:41 UTC, 1 replies.
- mapValues Transformation (JavaPairRDD) - posted by Sushrut Ikhar <su...@gmail.com> on 2015/12/15 08:06:19 UTC, 2 replies.
- Mixing Long Run Periodic Update Jobs With Streaming Scoring - posted by atbrew <at...@gmail.com> on 2015/12/15 11:10:14 UTC, 1 replies.
- Unable to get json for application jobs in spark 1.5.0 - posted by rakesh rakshit <ih...@gmail.com> on 2015/12/15 12:16:46 UTC, 1 replies.
- Cluster mode dependent jars not working - posted by vimal dinakaran <vi...@gmail.com> on 2015/12/15 12:57:41 UTC, 2 replies.
- Comparison of serialized objects - posted by Max <mx...@gmx.net> on 2015/12/15 13:27:02 UTC, 0 replies.
- about spark on hbase - posted by censj <ce...@lotuseed.com> on 2015/12/15 13:38:56 UTC, 1 replies.
- Spark on YARN multitenancy - posted by David Fox <da...@gmail.com> on 2015/12/15 16:30:50 UTC, 3 replies.
- Securing objects on the thrift server - posted by Younes Naguib <Yo...@tritondigital.com> on 2015/12/15 17:25:17 UTC, 3 replies.
- Re: how to spark streaming application start working on next batch before completing on previous batch . - posted by Mukesh Jha <me...@gmail.com> on 2015/12/15 18:19:36 UTC, 2 replies.
- Can't create UDF through thriftserver, no error reported - posted by Antonio Piccolboni <an...@piccolboni.info> on 2015/12/15 20:14:26 UTC, 4 replies.
- UDAF support in PySpark? - posted by Wei Chen <we...@gmail.com> on 2015/12/15 20:45:55 UTC, 2 replies.
- Spark big rdd problem - posted by Eran Witkon <er...@gmail.com> on 2015/12/15 20:50:44 UTC, 6 replies.
- ideal number of executors per machine - posted by Veljko Skarich <ve...@gmail.com> on 2015/12/15 22:07:46 UTC, 6 replies.
- How to do map join in Spark SQL - posted by Alexander Pivovarov <ap...@gmail.com> on 2015/12/15 22:21:31 UTC, 4 replies.
- Hive on Spark - Error: Child process exited before connecting back - posted by Ophir Etzion <op...@foursquare.com> on 2015/12/15 23:26:41 UTC, 1 replies.
- java.lang.NoSuchMethodError while saving a random forest model Spark version 1.5 - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/12/16 00:23:03 UTC, 2 replies.
- hiveContext: storing lookup of partitions - posted by Gourav Sengupta <go...@gmail.com> on 2015/12/16 01:06:27 UTC, 4 replies.
- PairRDD(K, L) to multiple files by key serializing each value in L before - posted by Daniel Valdivia <ho...@danielvaldivia.com> on 2015/12/16 02:05:15 UTC, 3 replies.
- security testing on spark ? - posted by Judy Nash <ju...@exchange.microsoft.com> on 2015/12/16 02:16:21 UTC, 0 replies.
- Pros and cons -Saving spark data in hive - posted by Divya Gehlot <di...@gmail.com> on 2015/12/16 03:04:33 UTC, 1 replies.
- How to keep long running spark-shell but avoid hitting Java Out of Memory Exception: PermGen Space - posted by yunshan <sa...@gmail.com> on 2015/12/16 03:13:09 UTC, 3 replies.
- YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/16 03:32:01 UTC, 1 replies.
- looking for Spark streaming unit example written in Java - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/16 03:36:18 UTC, 0 replies.
- looking for a easier way to count the number of items in a JavaDStream - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/16 03:54:39 UTC, 3 replies.
- Re: looking for Spark streaming unit example written in Java - posted by Ted Yu <yu...@gmail.com> on 2015/12/16 04:09:36 UTC, 1 replies.
- Benchmarking with multiple users in Spark - posted by Rajesh Balamohan <ra...@gmail.com> on 2015/12/16 05:27:15 UTC, 0 replies.
- Re: [Spark 1.5]: Exception in thread "broadcast-hash-join-2" java.lang.OutOfMemoryError: Java heap space -- Work in 1.4, but 1.5 doesn't - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/16 07:21:20 UTC, 1 replies.
- Need clarifications in Regression - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/16 07:35:42 UTC, 1 replies.
- Intercept in Linear Regression - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/16 08:06:41 UTC, 1 replies.
- which aws instance type for shuffle performance - posted by Rastan Boroujerdi <ra...@gmail.com> on 2015/12/16 08:11:08 UTC, 2 replies.
- NPE in using AvroKeyValueInputFormat for newAPIHadoopFile - posted by Jinyuan Zhou <zh...@gmail.com> on 2015/12/16 08:44:22 UTC, 2 replies.
- Re: Hive error after update from 1.4.1 to 1.5.2 - posted by Ashwin Sai Shankar <as...@netflix.com.INVALID> on 2015/12/16 09:12:45 UTC, 1 replies.
- Re: Compiling spark 1.5.1 fails with scala.reflect.internal.Types$TypeError: bad symbolic reference. - posted by Simon Hafner <re...@gmail.com> on 2015/12/16 10:16:09 UTC, 0 replies.
- Preventing an RDD from shuffling - posted by sparkuser2345 <hm...@gmail.com> on 2015/12/16 11:23:49 UTC, 3 replies.
- SparkContext.cancelJob - what part of Spark uses it? Nothing in webUI to kill jobs? - posted by Jacek Laskowski <ja...@japila.pl> on 2015/12/16 14:15:38 UTC, 5 replies.
- PySpark Connection reset by peer: socket write error - posted by Surendran Duraisamy <su...@gmail.com> on 2015/12/16 15:50:20 UTC, 2 replies.
- WholeTextFile for 8000~ files - problem - posted by Eran Witkon <er...@gmail.com> on 2015/12/16 16:23:47 UTC, 1 replies.
- Trying to index document in Solr with Spark and solr-spark library - posted by Guillermo Ortiz <ko...@gmail.com> on 2015/12/16 16:26:22 UTC, 1 replies.
- Spark Streaming: How to specify deploy mode through configuration parameter? - posted by Saiph Kappa <sa...@gmail.com> on 2015/12/16 16:31:48 UTC, 5 replies.
- Re: Spark-shell connecting to Mesos stuck at sched.cpp - posted by Aaron <aa...@gmail.com> on 2015/12/16 17:06:03 UTC, 0 replies.
- Using Spark to process JSON with gzip filed - posted by Eran Witkon <er...@gmail.com> on 2015/12/16 17:32:22 UTC, 3 replies.
- SparkEx PiAverage: Re: How to meet nested loop on pairRdd? - posted by MegaLearn <ji...@megalearningllc.com> on 2015/12/16 17:33:55 UTC, 1 replies.
- Error while running a job in yarn-client mode - posted by sunil m <26...@gmail.com> on 2015/12/16 17:37:26 UTC, 1 replies.
- HiveContext Self join not reading from cache - posted by Gourav Sengupta <go...@gmail.com> on 2015/12/16 18:34:29 UTC, 6 replies.
- File not found error running query in spark-shell - posted by Ted Yu <yu...@gmail.com> on 2015/12/16 19:39:46 UTC, 4 replies.
- Setting the vote rate in a Random Forest in MLlib - posted by "Young, Matthew T" <ma...@intel.com> on 2015/12/16 19:46:07 UTC, 0 replies.
- Scala VS Java VS Python - posted by Daniel Valdivia <ho...@danielvaldivia.com> on 2015/12/16 20:54:34 UTC, 6 replies.
- Parquet datasource optimization for distinct query - posted by pnpritchard <ni...@falkonry.com> on 2015/12/16 21:17:57 UTC, 0 replies.
- RandomForestModel Save is throwing NoSuchMethodError with Spark Version 1.5x - posted by Rachana Srivastava <Ra...@markmonitor.com> on 2015/12/16 22:01:28 UTC, 0 replies.
- count(*) performance in Hive vs Spark DataFrames - posted by Christopher Brady <ch...@oracle.com> on 2015/12/16 22:19:08 UTC, 0 replies.
- Access row column by field name - posted by Daniel Valdivia <ho...@danielvaldivia.com> on 2015/12/17 03:58:52 UTC, 1 replies.
- Spark job dying when I submit through oozie - posted by Scott Gallavan <sg...@gmail.com> on 2015/12/17 04:00:59 UTC, 0 replies.
- Error getting response from spark driver rest APIs : java.lang.IncompatibleClassChangeError: Implementing class - posted by ihavethepotential <ih...@gmail.com> on 2015/12/17 05:39:30 UTC, 0 replies.
- MLlib: Feature Importances API - posted by Asim Jalis <as...@gmail.com> on 2015/12/17 06:41:37 UTC, 3 replies.
- spark master process shutdown for timeout - posted by yaoxiaohua <ya...@outlook.com> on 2015/12/17 07:32:50 UTC, 1 replies.
- Are there some solution to complete the transform category variables into dummy variable in scala or spark ? - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/17 09:00:57 UTC, 1 replies.
- java.io.FileNotFoundException(Too many open files) in Spark streaming - posted by Priya Ch <le...@gmail.com> on 2015/12/17 09:36:03 UTC, 7 replies.
- number of blocks in ALS/recommendation API - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/17 12:17:12 UTC, 1 replies.
- Download Problem with Spark 1.5.2 pre-built for Hadoop 1.X - posted by abc123 <cn...@hotmail.de> on 2015/12/17 12:48:35 UTC, 1 replies.
- Some tasks take a long time to find local block - posted by patrick256 <pa...@gmail.com> on 2015/12/17 14:33:41 UTC, 0 replies.
- Dynamic jar loading - posted by amarouni <am...@talend.com> on 2015/12/17 15:53:51 UTC, 2 replies.
- Matrix Inverse - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/17 16:36:41 UTC, 0 replies.
- Spark streaming: Consistency of multiple streams in Spark - posted by Ashwin <as...@gmail.com> on 2015/12/17 16:37:02 UTC, 0 replies.
- One task hangs and never finishes - posted by Daniel Haviv <da...@veracity-group.com> on 2015/12/17 16:55:18 UTC, 1 replies.
- How to submit spark job to YARN from scala code - posted by Saiph Kappa <sa...@gmail.com> on 2015/12/17 17:50:03 UTC, 3 replies.
- [SparkML] RandomForestModel vs PipelineModel API on a Driver. - posted by Eugene Morozov <ev...@gmail.com> on 2015/12/17 18:00:47 UTC, 0 replies.
- How to access resources added with SQL: ADD FILE - posted by Antonio Piccolboni <an...@piccolboni.info> on 2015/12/17 18:23:02 UTC, 0 replies.
- pyspark + kafka + streaming = NoSuchMethodError - posted by Christos Mantas <cm...@cslab.ece.ntua.gr> on 2015/12/17 19:10:36 UTC, 4 replies.
- unsubscribe - posted by Roman Garcia <ro...@gmail.com> on 2015/12/17 20:47:51 UTC, 0 replies.
- Spark 1.6 - YARN Cluster Mode - posted by syepes <sy...@gmail.com> on 2015/12/17 21:03:16 UTC, 1 replies.
- Can't run spark on yarn - posted by Eran Witkon <er...@gmail.com> on 2015/12/17 21:25:39 UTC, 2 replies.
- Re: Large number of conf broadcasts - posted by Prasad Ravilla <pr...@slalom.com> on 2015/12/17 21:35:57 UTC, 3 replies.
- Spark Path Wildcards Question - posted by Mark Vervuurt <m....@gmail.com> on 2015/12/17 21:53:13 UTC, 0 replies.
- ShippableVertexPartitionOps: Joining two VertexPartitions with different indexes is slow. - posted by Anderson de Andrade <ad...@gmail.com> on 2015/12/18 01:00:01 UTC, 0 replies.
- Writing output fails when spark.unsafe.offHeap is enabled - posted by Mayuresh Kunjir <ma...@cs.duke.edu> on 2015/12/18 02:04:38 UTC, 4 replies.
- Base ERROR - posted by censj <ce...@lotuseed.com> on 2015/12/18 02:57:04 UTC, 1 replies.
- Python 3.x support - posted by YaoPau <jo...@gmail.com> on 2015/12/18 04:26:14 UTC, 1 replies.
- seriazable error in apache spark job - posted by Pankaj Narang <pa...@gmail.com> on 2015/12/18 08:43:13 UTC, 1 replies.
- Is DataFrame.groupBy supposed to preserve order within groups? - posted by Timothée Carayol <ti...@gmail.com> on 2015/12/18 08:55:14 UTC, 2 replies.
- Difference between Local Hive Metastore server and A Hive-based Metastore server - posted by Divya Gehlot <di...@gmail.com> on 2015/12/18 08:57:10 UTC, 0 replies.
- Re: Yarn application ID for Spark job on Yarn - posted by Kyle Lin <ky...@gmail.com> on 2015/12/18 09:01:17 UTC, 4 replies.
- Error on using updateStateByKey - posted by Abhishek Anand <ab...@gmail.com> on 2015/12/18 09:12:25 UTC, 1 replies.
- Re: security testing on spark ? - posted by Akhil Das <ak...@sigmoidanalytics.com> on 2015/12/18 10:23:44 UTC, 0 replies.
- Difference between DataFrame.cache() and hiveContext.cacheTable()? - posted by Sahil Sareen <sa...@gmail.com> on 2015/12/18 11:28:31 UTC, 0 replies.
- how to turn off spark streaming gracefully ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/18 12:04:56 UTC, 3 replies.
- Does calling sqlContext.cacheTable("oldTableName") remove the cached contents of the oldTable - posted by Sahil Sareen <sa...@gmail.com> on 2015/12/18 12:33:19 UTC, 6 replies.
- Limit of application submission to cluster - posted by Sa...@wellsfargo.com on 2015/12/18 14:52:18 UTC, 1 replies.
- Spark with log4j - posted by Kalpesh Jadhav <ka...@citiustech.com> on 2015/12/18 16:23:36 UTC, 7 replies.
- Joining DataFrames - Causing Cartesian Product - posted by Prasad Ravilla <pr...@slalom.com> on 2015/12/18 16:38:24 UTC, 1 replies.
- Spark batch getting hung up - posted by SRK <sw...@gmail.com> on 2015/12/18 17:25:45 UTC, 3 replies.
- Question about Spark Streaming checkpoint interval - posted by Lan Jiang <lj...@gmail.com> on 2015/12/18 17:26:20 UTC, 1 replies.
- Configuring log4j - posted by "Afshartous, Nick" <na...@turbine.com> on 2015/12/18 17:46:12 UTC, 1 replies.
- Re: Joining DataFrames - Causing Cartesian Product - posted by Ted Yu <yu...@gmail.com> on 2015/12/18 19:11:01 UTC, 1 replies.
- ALS predictAll does not generate all the user/item ratings - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/18 21:42:26 UTC, 1 replies.
- hive on spark - posted by Ophir Etzion <op...@foursquare.com> on 2015/12/18 21:45:54 UTC, 1 replies.
- How to run multiple Spark jobs as a workflow that takes input from a Streaming job in Oozie - posted by SRK <sw...@gmail.com> on 2015/12/19 00:27:24 UTC, 2 replies.
- Re: "Ambiguous references" to a field set in a partitioned table AND the data - posted by sim <si...@swoop.com> on 2015/12/19 01:06:50 UTC, 0 replies.
- Spark Streaming, PySpark 1.3, randomly losing connection - posted by YaoPau <jo...@gmail.com> on 2015/12/19 07:39:37 UTC, 0 replies.
- how to fetch all of data from hbase table in spark java - posted by Sateesh Karuturi <sa...@gmail.com> on 2015/12/19 08:56:47 UTC, 1 replies.
- About Huawei-Spark/Spark-SQL-on-HBase - posted by censj <ce...@lotuseed.com> on 2015/12/19 10:14:59 UTC, 2 replies.
- I coded an example to use Twitter stream as a data source for Spark - posted by Amir Rahnama <am...@gmail.com> on 2015/12/19 13:17:33 UTC, 2 replies.
- How to map a HashMap containing vertex as key and edge as values into Spark RDD - posted by aparasur <pa...@gmail.com> on 2015/12/19 18:49:19 UTC, 0 replies.
- spark 1.5.2 memory leak? reading JSON - posted by Eran Witkon <er...@gmail.com> on 2015/12/19 22:55:51 UTC, 5 replies.
- Hive error when starting up spark-shell in 1.5.2 - posted by Marco Mistroni <mm...@gmail.com> on 2015/12/19 22:58:49 UTC, 3 replies.
- Pyspark SQL Join Failure - posted by Weiwei Zhang <wz...@dons.usfca.edu> on 2015/12/19 23:30:34 UTC, 1 replies.
- TaskCompletionListener and Exceptions - posted by Neelesh <ne...@gmail.com> on 2015/12/20 00:44:05 UTC, 3 replies.
- Fwd: Numpy and dynamic loading - posted by Abhinav M Kulkarni <ab...@gmail.com> on 2015/12/20 04:03:50 UTC, 4 replies.
- Re: 101 question on external metastore - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/20 05:41:52 UTC, 1 replies.
- Getting an error in insertion to mysql through sparkcontext in java.. - posted by Sree Eedupuganti <sr...@inndata.in> on 2015/12/20 07:43:46 UTC, 1 replies.
- combining multiple JSON files to one DataFrame - posted by Eran Witkon <er...@gmail.com> on 2015/12/20 08:20:55 UTC, 3 replies.
- DataFrame operations - posted by Eran Witkon <er...@gmail.com> on 2015/12/20 12:40:39 UTC, 2 replies.
- error: not found: value StructType on 1.5.2 - posted by Eran Witkon <er...@gmail.com> on 2015/12/20 14:43:27 UTC, 2 replies.
- How to convert and RDD to DF? - posted by Eran Witkon <er...@gmail.com> on 2015/12/20 15:31:03 UTC, 3 replies.
- Re: Word2Vec distributed? - posted by Yao <yg...@ford.com> on 2015/12/20 18:38:47 UTC, 0 replies.
- Query partition keys for indexed parquet input - posted by ajackson92 <aj...@pobox.com> on 2015/12/20 22:44:16 UTC, 0 replies.
- pyspark streaming crashes - posted by Antony Mayi <an...@yahoo.com.INVALID> on 2015/12/20 23:05:02 UTC, 1 replies.
- Memory allocation for Broadcast values - posted by Pat Ferrel <pa...@occamsmachete.com> on 2015/12/21 01:20:03 UTC, 2 replies.
- custom schema in spark throwing error - posted by Divya Gehlot <di...@gmail.com> on 2015/12/21 03:56:57 UTC, 1 replies.
- Spark Streaming - Number of RDDs in Dstream - posted by Arun Patel <ar...@gmail.com> on 2015/12/21 04:04:07 UTC, 3 replies.
- Creating vectors from a dataframe - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/21 06:44:14 UTC, 1 replies.
- Getting estimates and standard error using ml.LinearRegression - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/21 06:47:27 UTC, 1 replies.
- create hive table in Spark with Java code - posted by Soni spark <so...@gmail.com> on 2015/12/21 08:03:00 UTC, 0 replies.
- How to implement statemachine functionality in apache-spark by python - posted by Esa Heikkinen <es...@student.tut.fi> on 2015/12/21 10:22:26 UTC, 0 replies.
- configure spark for hive context - posted by Divya Gehlot <di...@gmail.com> on 2015/12/21 11:05:14 UTC, 1 replies.
- Deployment and performance related queries for Spark and Cassandra - posted by Ashish Gadkari <as...@gmail.com> on 2015/12/21 11:09:28 UTC, 0 replies.
- rdd only with one partition - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/21 11:47:44 UTC, 4 replies.
- number limit of map for spark - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/21 12:44:15 UTC, 5 replies.
- [Beg for help] spark job with very low efficiency - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/21 13:21:30 UTC, 2 replies.
- fishing for help! - posted by Eran Witkon <er...@gmail.com> on 2015/12/21 13:53:40 UTC, 6 replies.
- get parameters of spark-submit - posted by Bonsen <he...@126.com> on 2015/12/21 14:09:39 UTC, 1 replies.
- spark-submit for dependent jars - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/21 14:15:41 UTC, 5 replies.
- argparse with pyspark - posted by Roberto Pagliari <ro...@asos.com> on 2015/12/21 16:10:37 UTC, 0 replies.
- GMM with diagonal covariance matrix - posted by Jaonary Rabarisoa <ja...@gmail.com> on 2015/12/21 16:58:21 UTC, 0 replies.
- Problem with Spark Standalone - posted by luca_guerra <lg...@bitbang.com> on 2015/12/21 17:07:08 UTC, 7 replies.
- Difference in AUCs b/w Spark's GBT and sklearn's - posted by Yahoo_SK <sk...@yahoo.co.uk> on 2015/12/21 17:17:54 UTC, 0 replies.
- Using inteliJ for spark development - posted by Eran Witkon <er...@gmail.com> on 2015/12/21 18:21:02 UTC, 6 replies.
- is Kafka Hard to configure? Does it have a high cost of ownership? - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/21 18:31:47 UTC, 1 replies.
- Applicaiton Detail UI change - posted by carlilek <ca...@janelia.hhmi.org> on 2015/12/21 19:13:10 UTC, 3 replies.
- error writing to stdout - posted by carlilek <ca...@janelia.hhmi.org> on 2015/12/21 21:45:09 UTC, 1 replies.
- Kafka Latency - posted by Bryan <br...@gmail.com> on 2015/12/22 01:53:26 UTC, 0 replies.
- spark-submit is ignoring "--executor-cores" - posted by Siva <sb...@gmail.com> on 2015/12/22 02:08:20 UTC, 5 replies.
- trouble implementing Transformer and calling DataFrame.withColumn() - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/22 02:54:11 UTC, 2 replies.
- spark streaming updateStateByKey state is nonsupport other type except ClassTag such as list? - posted by "ouruia@cnsuning.com" <ou...@cnsuning.com> on 2015/12/22 03:51:41 UTC, 0 replies.
- Extract SSerr SStot from Linear Regression using ml package - posted by Arunkumar Pillai <ar...@gmail.com> on 2015/12/22 05:23:22 UTC, 2 replies.
- Fat jar can't find jdbc - posted by David Yerrington <da...@yerrington.net> on 2015/12/22 06:12:21 UTC, 6 replies.
- error while defining custom schema in Spark 1.5.0 - posted by Divya Gehlot <di...@gmail.com> on 2015/12/22 09:03:15 UTC, 2 replies.
- val listRDD =ssc.socketTextStream(localhost,9999) on Yarn - posted by prasadreddy <al...@gmail.com> on 2015/12/22 09:11:20 UTC, 1 replies.
- UnsupportedOperationException Schema for type String => Int is not supported - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/22 09:27:20 UTC, 0 replies.
- Writing partitioned Avro data to HDFS - posted by Jan Holmberg <ja...@perigeum.fi> on 2015/12/22 10:01:03 UTC, 7 replies.
- Apache spark certification pass percentage ? - posted by "Kali.tummala@gmail.com" <Ka...@gmail.com> on 2015/12/22 11:25:58 UTC, 1 replies.
- driver OOM due to io.netty.buffer items not getting finalized - posted by Antony Mayi <an...@yahoo.com.INVALID> on 2015/12/22 11:59:29 UTC, 4 replies.
- Client session timed out, have not heard from server in - posted by yaoxiaohua <ya...@outlook.com> on 2015/12/22 12:30:04 UTC, 4 replies.
- Tips for Spark's Random Forest slow performance - posted by Alexander Ratnikov <ra...@gmail.com> on 2015/12/22 14:57:23 UTC, 3 replies.
- How to handle categorical variables in Spark MLlib? - posted by Hokam Singh Chauhan <ho...@gmail.com> on 2015/12/22 16:13:07 UTC, 3 replies.
- Re: spark streaming updateStateByKey state is nonsupport other type except ClassTag such as list? - posted by Dean Wampler <de...@gmail.com> on 2015/12/22 17:46:02 UTC, 3 replies.
- Getting EOFException when using cloudera built spark 1.5.0. - posted by hokam chauhan <ho...@gmail.com> on 2015/12/22 18:18:51 UTC, 0 replies.
- Stand Alone Cluster - Strange issue - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/22 18:34:30 UTC, 3 replies.
- How to Parse & flatten JSON object in a text file using Spark & Scala into Dataframe - posted by raja kbv <ra...@yahoo.com.INVALID> on 2015/12/22 19:07:04 UTC, 3 replies.
- building a distributed k-d tree with spark - posted by Russ <ru...@yahoo.com.INVALID> on 2015/12/22 20:00:03 UTC, 0 replies.
- Spark data frame - posted by Gaurav Agarwal <ga...@gmail.com> on 2015/12/22 20:09:30 UTC, 4 replies.
- Classification model method not found - posted by njoshi <ni...@teamaol.com> on 2015/12/22 22:04:42 UTC, 2 replies.
- Do existing R packages work with SparkR data frames - posted by Duy Lan Nguyen <nd...@gmail.com> on 2015/12/22 22:50:27 UTC, 3 replies.
- Missing dependencies when submitting scala app - posted by Daniel Valdivia <ho...@danielvaldivia.com> on 2015/12/22 23:15:51 UTC, 2 replies.
- Can SqlContext be used inside mapPartitions - posted by SRK <sw...@gmail.com> on 2015/12/23 01:44:34 UTC, 2 replies.
- Re: should I file a bug? Re: trouble implementing Transformer and calling DataFrame.withColumn() - posted by Jeff Zhang <zj...@gmail.com> on 2015/12/23 02:20:39 UTC, 0 replies.
- Spark SQL 1.5.2 missing JDBC driver for PostgreSQL? - posted by b2k70 <bb...@gmail.com> on 2015/12/23 02:22:14 UTC, 7 replies.
- FW: spark 1.5.2 application UI static resources not found - posted by Tim Barthram <Ti...@iag.com.au> on 2015/12/23 02:24:14 UTC, 1 replies.
- Which Hive version should be used with Spark 1.5.2? - posted by Arthur Chan <ar...@gmail.com> on 2015/12/23 03:17:23 UTC, 1 replies.
- Can't read data correctly through beeline when data is save by HiveContext - posted by licl <li...@126.com> on 2015/12/23 03:25:50 UTC, 2 replies.
- Regarding spark in nemory - posted by Gaurav Agarwal <ga...@gmail.com> on 2015/12/23 04:34:44 UTC, 1 replies.
- Problem of submitting Spark task to cluster from eclipse IDE on Windows - posted by superbee84 <ho...@qq.com> on 2015/12/23 05:32:02 UTC, 3 replies.
- running lda in spark throws exception - posted by Li Li <fa...@gmail.com> on 2015/12/23 06:40:14 UTC, 1 replies.
- running spark application encouter an error (maven relative) - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/23 07:10:28 UTC, 0 replies.
- Re: Streaming json records from kafka ... how can I process ... help please :) - posted by Gideon <gi...@volcanodata.com> on 2015/12/23 09:05:26 UTC, 2 replies.
- error creating custom schema - posted by Divya Gehlot <di...@gmail.com> on 2015/12/23 10:47:03 UTC, 1 replies.
- Spark Streaming 1.5.2+Kafka+Python (docs) - posted by Vyacheslav Yanuk <vy...@codeminders.com> on 2015/12/23 14:24:54 UTC, 1 replies.
- Problem using limit clause in spark sql - posted by 汪洋 <ti...@icloud.com> on 2015/12/23 14:26:51 UTC, 8 replies.
- Spark Streaming 1.5.2+Kafka+Python. Strange reading - posted by Vyacheslav Yanuk <vy...@codeminders.com> on 2015/12/23 15:03:53 UTC, 1 replies.
- rdd split into new rdd - posted by Yasemin Kaya <go...@gmail.com> on 2015/12/23 16:11:21 UTC, 4 replies.
- Unable to create hive table using HiveContext - posted by Soni spark <so...@gmail.com> on 2015/12/23 16:24:40 UTC, 1 replies.
- DataFrameWriter.format(String) is there a list of options? - posted by Christopher Brady <ch...@oracle.com> on 2015/12/23 17:57:15 UTC, 2 replies.
- How to call mapPartitions on DataFrame? - posted by unk1102 <um...@gmail.com> on 2015/12/23 18:43:56 UTC, 0 replies.
- Using Java Function API with Java 8 - posted by rdpratti <de...@easternct.edu> on 2015/12/24 02:13:21 UTC, 1 replies.
- Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows - posted by 真·金蜂无双 <ho...@qq.com> on 2015/12/24 05:18:22 UTC, 0 replies.
- RE: How to Parse & flatten JSON object in a text file using Spark &Scala into Dataframe - posted by Bharathi Raja <ra...@yahoo.com.INVALID> on 2015/12/24 06:30:36 UTC, 3 replies.
- Extract compressed JSON withing JSON - posted by Eran Witkon <er...@gmail.com> on 2015/12/24 10:42:27 UTC, 1 replies.
- error in spark cassandra connector - posted by Vijay Kandiboyina <vi...@inndata.in> on 2015/12/24 11:06:46 UTC, 1 replies.
- Spark Streaming + Kafka + scala job message read issue - posted by vi...@wipro.com on 2015/12/24 11:21:39 UTC, 9 replies.
- How to contribute by picking up starter bugs - posted by lokeshkumar <lo...@dataken.net> on 2015/12/24 11:44:43 UTC, 2 replies.
- How to ignore case in dataframe groupby? - posted by Bharathi Raja <ra...@yahoo.com.INVALID> on 2015/12/24 13:04:30 UTC, 4 replies.
- RE: How to Parse & flatten JSON object in a text file using Spark&Scala into Dataframe - posted by Bharathi Raja <ra...@yahoo.com.INVALID> on 2015/12/24 13:06:09 UTC, 0 replies.
- Newbie Help for spark's not finding native hadoop warning - posted by Bilinmek Istemiyor <be...@gmail.com> on 2015/12/24 13:19:28 UTC, 2 replies.
- how to debug java.lang.IllegalArgumentException: object is not an instance of declaring class - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/24 18:55:21 UTC, 1 replies.
- Spark Streaming - print accumulators value every period as logs - posted by Roberto Coluccio <ro...@gmail.com> on 2015/12/25 04:22:21 UTC, 1 replies.
- Job Error:Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/) - posted by donhoff_h <16...@qq.com> on 2015/12/25 07:26:43 UTC, 2 replies.
- How can I get the column data based on specific column name and then stored these data in array or list ? - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/25 08:33:12 UTC, 1 replies.
- Re: How can I get the column data based on specific column name and then stored these data in array or list ? - posted by "fightfate@163.com" <fi...@163.com> on 2015/12/25 09:25:52 UTC, 0 replies.
- Re: Job Error:Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/) - posted by donhoff_h <16...@qq.com> on 2015/12/25 09:28:10 UTC, 1 replies.
- Struggling time by data - posted by Yasemin Kaya <go...@gmail.com> on 2015/12/25 09:53:59 UTC, 2 replies.
- REST Api not working in spark - posted by aman solanki <yo...@gmail.com> on 2015/12/25 11:43:21 UTC, 2 replies.
- java.sql.SQLException: Unsupported type -101 - posted by Madabhattula Rajesh Kumar <mr...@gmail.com> on 2015/12/25 11:55:31 UTC, 0 replies.
- Stuck with DataFrame df.select("select * from table"); - posted by Eugene Morozov <ev...@gmail.com> on 2015/12/25 14:34:21 UTC, 12 replies.
- ClassNotFoundException when executing spark jobs in standalone/cluster mode on Spark 1.5.2 - posted by Saiph Kappa <sa...@gmail.com> on 2015/12/25 23:57:05 UTC, 2 replies.
- Spark SQL UDF with Struct input parameters - posted by Deenar Toraskar <de...@gmail.com> on 2015/12/26 03:42:28 UTC, 1 replies.
- why one of Stage is into Skipped section instead of Completed - posted by Prem Spark <sp...@gmail.com> on 2015/12/26 05:41:20 UTC, 2 replies.
- number of executors in sparkR.init() - posted by Franc Carter <fr...@gmail.com> on 2015/12/26 06:23:24 UTC, 2 replies.
- Cassandra read throughput using DataStax connector in Spark - posted by Noorul Islam Kamal Malmiyoda <no...@noorul.com> on 2015/12/26 16:37:57 UTC, 0 replies.
- Re: Error getting response from spark driver rest APIs : java.lang.IncompatibleClassChangeError: Implementing class - posted by Hokam Singh Chauhan <ho...@gmail.com> on 2015/12/26 18:20:18 UTC, 0 replies.
- 1.5.2 prebuilt for 2.4 spark-submit standalone Python scripts not running - posted by peteranolaN <pe...@gmail.com> on 2015/12/27 01:27:10 UTC, 0 replies.
- ERROR server.TThreadPoolServer: Error occurred during processing of message - posted by Dasun Hegoda <da...@gmail.com> on 2015/12/27 06:09:44 UTC, 4 replies.
- partitioning json data in spark - posted by Նարեկ Գալստեան <ng...@gmail.com> on 2015/12/27 14:18:39 UTC, 7 replies.
- Is the GraphX Programming Guide wrong in the chapter of Join Opereators? - posted by Bo-Heng Chen <bo...@gmail.com> on 2015/12/27 16:56:08 UTC, 0 replies.
- Pattern type is incompatible with expected type - posted by pkhamutou <p....@gmail.com> on 2015/12/27 19:08:59 UTC, 3 replies.
- Passing parameters to spark SQL - posted by Ajaxx <aj...@pobox.com> on 2015/12/27 22:11:56 UTC, 3 replies.
- Can anyone explain Spark behavior for below? Kudos in Advance - posted by Prem Spark <sp...@gmail.com> on 2015/12/28 00:14:40 UTC, 1 replies.
- DataFrame Vs RDDs ... Which one to use When ? - posted by Divya Gehlot <di...@gmail.com> on 2015/12/28 03:18:14 UTC, 5 replies.
- DataFrame Save is writing just column names while saving - posted by Divya Gehlot <di...@gmail.com> on 2015/12/28 03:32:03 UTC, 3 replies.
- Inconsistent behavior of randomSplit in YARN mode - posted by Gaurav Kumar <ga...@gmail.com> on 2015/12/28 06:56:12 UTC, 2 replies.
- Opening Dynamic Scaling Executors on Yarn - posted by 顾亮亮 <gu...@qiyi.com> on 2015/12/28 07:00:09 UTC, 5 replies.
- Help: Driver OOM when shuffle large amount of data - posted by kendal <ke...@163.com> on 2015/12/28 07:02:25 UTC, 2 replies.
- returns empty result set when using TimestampType and NullType as StructType +DataFrame +Scala + Spark 1.4.1 - posted by Divya Gehlot <di...@gmail.com> on 2015/12/28 09:49:19 UTC, 1 replies.
- how to use sparkR or spark MLlib load csv file on hdfs then calculate covariance - posted by zhangjp <59...@qq.com> on 2015/12/28 10:21:49 UTC, 0 replies.
- Problem About Worker System.out - posted by David John <da...@outlook.com> on 2015/12/28 10:33:16 UTC, 2 replies.
- Using Spark for high concurrent load tasks - posted by Aliaksei Tsyvunchyk <at...@exadel.com> on 2015/12/28 10:34:19 UTC, 0 replies.
- Timestamp datatype in dataframe + Spark 1.4.1 - posted by Divya Gehlot <di...@gmail.com> on 2015/12/28 10:42:36 UTC, 3 replies.
- Re: how to use sparkR or spark MLlib load csv file on hdfs then calculate covariance - posted by Yanbo Liang <yb...@gmail.com> on 2015/12/28 11:30:13 UTC, 2 replies.
- Is there anyway to log properties from a Spark application - posted by alvarobrandon <al...@gmail.com> on 2015/12/28 13:18:46 UTC, 3 replies.
- Spark DataFrame callUdf does not compile? - posted by unk1102 <um...@gmail.com> on 2015/12/28 16:26:20 UTC, 5 replies.
- trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory" - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/28 23:25:44 UTC, 1 replies.
- Re: trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory" - posted by Michael Armbrust <mi...@databricks.com> on 2015/12/28 23:41:28 UTC, 1 replies.
- what is the difference between coalese() and repartition() ? Re: trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory" - posted by Andy Davidson <An...@SantaCruzIntegration.com> on 2015/12/29 00:09:31 UTC, 0 replies.
- Can't submit job to stand alone cluster - posted by Daniel Valdivia <ho...@danielvaldivia.com> on 2015/12/29 00:16:25 UTC, 14 replies.
- Re: SPARK_CLASSPATH out, spark.executor.extraClassPath in? - posted by jiml <ji...@megalearningllc.com> on 2015/12/29 01:33:34 UTC, 0 replies.
- [Spakr1.4.1] StuctField for date column in CSV file while creating custom schema - posted by Divya Gehlot <di...@gmail.com> on 2015/12/29 04:32:45 UTC, 1 replies.
- Re: Spark submit does automatically upload the jar to cluster? - posted by jiml <ji...@megalearningllc.com> on 2015/12/29 05:08:08 UTC, 1 replies.
- map spark.driver.appUIAddress IP to different IP - posted by Divya Gehlot <di...@gmail.com> on 2015/12/29 06:58:36 UTC, 1 replies.
- Re: what is the difference between coalese() and repartition() ? Re: trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory" - posted by Hyukjin Kwon <gu...@gmail.com> on 2015/12/29 07:54:17 UTC, 0 replies.
- Re: how to use sparkR or spark MLlib load csv file on hdfs then calculate covariance - posted by zhangjp <59...@qq.com> on 2015/12/29 08:20:33 UTC, 2 replies.
- Spark 1.5.2 compatible spark-cassandra-connector - posted by vi...@wipro.com on 2015/12/29 13:40:33 UTC, 3 replies.
- Task hang problem - posted by Darren Govoni <da...@ontrenet.com> on 2015/12/29 18:19:02 UTC, 2 replies.
- SparkSQL Hive orc snappy table - posted by Dawid Wysakowicz <wy...@gmail.com> on 2015/12/29 18:25:01 UTC, 3 replies.
- difference between ++ and Union of a RDD - posted by "email2dgk@gmail.com" <em...@gmail.com> on 2015/12/29 19:41:21 UTC, 3 replies.
- Executor deregistered after 2mins (mesos, 1.6.0-rc4) - posted by Adrian Bridgett <ad...@opensignal.com> on 2015/12/29 21:43:58 UTC, 4 replies.
- Zip data frames - posted by Daniel Siegmann <da...@teamaol.com> on 2015/12/30 01:47:31 UTC, 0 replies.
- Re: trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory" - posted by Davies Liu <da...@databricks.com> on 2015/12/30 01:51:26 UTC, 2 replies.
- Problem with WINDOW functions? - posted by vadimtk <ap...@gmail.com> on 2015/12/30 02:28:45 UTC, 10 replies.
- Does Spark SQL support rollup like HQL - posted by Yi Zhang <zh...@yahoo.com.INVALID> on 2015/12/30 04:40:37 UTC, 3 replies.
- [SparkSQL][Parquet] Read from nested parquet data - posted by lin <ku...@gmail.com> on 2015/12/30 10:48:58 UTC, 3 replies.
- Monitoring Spark HDFS Reads and Writes - posted by alvarobrandon <al...@gmail.com> on 2015/12/30 14:19:32 UTC, 2 replies.
- Spark MLLib KMeans Performance on Amazon EC2 M3.2xlarge - posted by Jia Zou <ja...@gmail.com> on 2015/12/30 15:20:18 UTC, 2 replies.
- DStream keyBy - posted by Brian London <br...@gmail.com> on 2015/12/30 17:13:06 UTC, 0 replies.
- Using Experminal Spark Features - posted by David Newberger <da...@wandcorp.com> on 2015/12/30 17:26:33 UTC, 1 replies.
- SparkSQL integration issue with AWS S3a - posted by KOSTIANTYN Kudriavtsev <ku...@gmail.com> on 2015/12/30 18:45:50 UTC, 14 replies.
- How to register a Tuple3 with KryoSerializer? - posted by Russ <ru...@yahoo.com.INVALID> on 2015/12/30 19:16:23 UTC, 1 replies.
- 2 of 20,675 Spark Streaming : Out put frequency different from read frequency in StatefulNetworkWordCount - posted by Soumitra Johri <so...@gmail.com> on 2015/12/30 21:00:42 UTC, 1 replies.
- Working offline with spark-core and sbt - posted by Ashic Mahtab <as...@live.com> on 2015/12/31 03:07:26 UTC, 1 replies.
- K means clustering in spark - posted by an...@gmail.com on 2015/12/31 05:52:29 UTC, 1 replies.
- Error:scalac: Error: assertion failed: List(object package$DebugNode, object package$DebugNode) - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/31 08:00:42 UTC, 0 replies.
- Error while starting Zeppelin Service in HDP2.3.2 - posted by Divya Gehlot <di...@gmail.com> on 2015/12/31 08:03:24 UTC, 0 replies.
- Fwd: Error:scalac: Error: assertion failed: List(object package$DebugNode, object package$DebugNode) - posted by zml张明磊 <mi...@Ctrip.com> on 2015/12/31 08:15:48 UTC, 0 replies.
- Help me! Spark WebUI is corrupted! - posted by LinChen <m2...@outlook.com> on 2015/12/31 11:05:38 UTC, 1 replies.
- what is the proper number set about --num-executors etc - posted by Zhiliang Zhu <zc...@yahoo.com.INVALID> on 2015/12/31 12:32:25 UTC, 0 replies.
- Batch together RDDs for Streaming output, without delaying execution of map or transform functions - posted by Ewan Leith <ew...@realitymine.com> on 2015/12/31 12:35:37 UTC, 2 replies.
- Re: efficient checking the existence of an item in a rdd - posted by domibd <db...@lipn.univ-paris13.fr> on 2015/12/31 17:26:35 UTC, 1 replies.
- Apparent bug in KryoSerializer - posted by Russ <ru...@yahoo.com.INVALID> on 2015/12/31 18:49:54 UTC, 1 replies.
- pass custom spark-conf - posted by KOSTIANTYN Kudriavtsev <ku...@gmail.com> on 2015/12/31 19:48:41 UTC, 2 replies.
- Problem embedding GaussianMixtureModel in a closure - posted by Tomasz Fruboes <To...@ncbj.gov.pl> on 2015/12/31 21:12:05 UTC, 0 replies.