- Re: DStream reduceByKeyAndWindow not using checkpointed data for inverse reducing old data - posted by N B <nb...@gmail.com> on 2018/09/01 09:19:33 UTC, 0 replies.
- Re: read snappy compressed files in spark - posted by "yujhe.li" <li...@gmail.com> on 2018/09/02 01:24:54 UTC, 0 replies.
- Reading mongoDB collection in Spark with arrays - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/09/02 10:09:56 UTC, 1 replies.
- Set can be passed in as an input argument but not as output - posted by V0lleyBallJunki3 <ve...@gmail.com> on 2018/09/04 03:47:03 UTC, 0 replies.
- Spark hive udf: no handler for UDAF analysis exception - posted by Swapnil Chougule <th...@gmail.com> on 2018/09/04 10:50:33 UTC, 1 replies.
- Seeing a framework registration loop with Spark 2.3.1 on DCOS 1.10.0 - posted by David Hesson <da...@arcadia.io> on 2018/09/05 00:01:14 UTC, 0 replies.
- getting error: value toDF is not a member of Seq[columns] - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/09/05 08:11:32 UTC, 15 replies.
- deploy-mode cluster. FileNotFoundException - posted by Guillermo Ortiz Fernández <gu...@gmail.com> on 2018/09/05 09:11:05 UTC, 2 replies.
- Padova Apache Spark Meetup - posted by Matteo Durighetto <m....@miriade.it> on 2018/09/05 12:44:18 UTC, 0 replies.
- Re: Unsubscribe - posted by Sunil Prabhakara <su...@gmail.com> on 2018/09/05 13:43:09 UTC, 7 replies.
- [ML] Setting Non-Transform Params for a Pipeline & PipelineModel - posted by Aleksander Eskilson <al...@gmail.com> on 2018/09/05 14:16:25 UTC, 0 replies.
- Spark Streaming RDD Cleanup too slow - posted by Prashant Sharma <pr...@plume.com> on 2018/09/06 04:13:57 UTC, 0 replies.
- How to make pyspark use custom python? - posted by mithril <tw...@gmail.com> on 2018/09/06 06:21:36 UTC, 3 replies.
- Re: CBO not working for Parquet Files - posted by emlyn <Em...@microsoft.com> on 2018/09/06 09:56:40 UTC, 0 replies.
- XGBoost Not distributing on cluster having more than 1 worker - posted by Aakash Basu <aa...@gmail.com> on 2018/09/06 10:05:53 UTC, 1 replies.
- Error in show() - posted by dimitris plakas <di...@gmail.com> on 2018/09/06 22:11:07 UTC, 3 replies.
- Re: [External Sender] Re: How to make pyspark use custom python? - posted by Femi Anthony <ol...@capitalone.com> on 2018/09/07 02:54:44 UTC, 1 replies.
- How to debug Spark job - posted by James Starks <su...@protonmail.com.INVALID> on 2018/09/07 09:47:58 UTC, 1 replies.
- Re: [External Sender] How to debug Spark job - posted by Femi Anthony <ol...@capitalone.com> on 2018/09/07 10:32:02 UTC, 2 replies.
- Spark job's driver programe consums too much memory - posted by James Starks <su...@protonmail.com.INVALID> on 2018/09/07 14:04:04 UTC, 4 replies.
- Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. - posted by Harel Gliksman <ha...@gmail.com> on 2018/09/07 14:34:49 UTC, 2 replies.
- [K8S] Driver and Executor Logging - posted by Rohit Menon <ro...@gmail.com> on 2018/09/07 17:31:39 UTC, 1 replies.
- Fwd: Using MongoDB as an Operational Data Store (ODS) with Spark Streaming - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/09/07 18:34:51 UTC, 1 replies.
- How to retreive data from nested json use dataframe - posted by 阎志涛 <to...@tendcloud.com> on 2018/09/08 14:38:32 UTC, 1 replies.
- DataSourceReader and SupportPushDownFilters for Short types - posted by Hugh Hyndman <hu...@redleaf.ca> on 2018/09/08 23:17:49 UTC, 0 replies.
- PushDown Filter Not Reset - posted by Hugh Hyndman <hu...@redleaf.ca> on 2018/09/09 10:56:14 UTC, 0 replies.
- custom sink & model transformation - posted by Stavros Kontopoulos <st...@lightbend.com> on 2018/09/10 00:16:21 UTC, 0 replies.
- Re: Register UDF duration runtime - posted by amittonge <am...@dimentrix.com> on 2018/09/10 08:58:02 UTC, 0 replies.
- Speakers needed for Apache DC Roadshow - posted by Rich Bowen <rb...@apache.org> on 2018/09/11 15:05:52 UTC, 0 replies.
- Drawing Big Data tech diagrams using Pen Tablets - posted by Mich Talebzadeh <mi...@gmail.com> on 2018/09/11 19:20:54 UTC, 6 replies.
- [Help] Set nThread in Spark cluster - posted by Aakash Basu <aa...@gmail.com> on 2018/09/12 08:40:03 UTC, 0 replies.
- Fixing NullType for parquet files - posted by Stephen Boesch <ja...@gmail.com> on 2018/09/12 17:26:27 UTC, 0 replies.
- Trying to improve performance of the driver. - posted by Guillermo Ortiz Fernández <gu...@gmail.com> on 2018/09/13 15:48:05 UTC, 0 replies.
- Python Dependencies Issue on EMR - posted by Jonas Shomorony <js...@stanford.edu> on 2018/09/14 02:08:38 UTC, 2 replies.
- Local vs Cluster - posted by Aakash Basu <aa...@gmail.com> on 2018/09/14 08:21:28 UTC, 2 replies.
- Is there any open source framework that converts Cypher to SparkSQL? - posted by kant kodali <ka...@gmail.com> on 2018/09/14 09:42:39 UTC, 1 replies.
- DAGScheduler in SparkStreaming - posted by Guillermo Ortiz <ko...@gmail.com> on 2018/09/14 10:35:03 UTC, 0 replies.
- Spark2 DynamicAllocation doesn't release executors that used cache - posted by Sergejs Andrejevs <S....@intrum.com> on 2018/09/14 11:33:08 UTC, 1 replies.
- What is the best way for Spark to read HDF5@scale? - posted by kathleen li <ka...@gmail.com> on 2018/09/14 14:26:55 UTC, 1 replies.
- Re: StackOverflow Error when run ALS with 100 iterations - posted by LeoB <le...@gmail.com> on 2018/09/14 17:10:07 UTC, 0 replies.
- [SparkSQL] Count Distinct issue - posted by Daniele Foroni <da...@gmail.com> on 2018/09/14 18:54:16 UTC, 1 replies.
- [Spark SQL] Catalyst ScalaReflection/ExpressionEncoder fail with relocated (shaded) classes - posted by johkelly <ja...@fullcontact.com> on 2018/09/14 23:49:51 UTC, 0 replies.
- Should python-2 be supported in Spark 3.0? - posted by Erik Erlandson <ee...@redhat.com> on 2018/09/15 18:09:24 UTC, 13 replies.
- FlatMapGroupsFunction Without Running Out of Memory For Large Groups - posted by ddukek <di...@placed.com> on 2018/09/15 20:49:18 UTC, 0 replies.
- Support STS to run in k8s deployment with spark deployment mode as cluster - posted by "Garlapati, Suryanarayana (Nokia - IN/Bangalore)" <su...@nokia.com> on 2018/09/16 04:45:23 UTC, 1 replies.
- unsubscribe - posted by Or Rappel-Kroyzer <Or...@mobileye.com> on 2018/09/16 05:28:53 UTC, 2 replies.
- Re: issue Running Spark Job on Yarn Cluster - posted by sivasonai <si...@gmail.com> on 2018/09/16 15:23:53 UTC, 0 replies.
- please help me: when I write code to connect kafka with spark using python and I run code on jupyer there is error display - posted by hager <lo...@yahoo.com> on 2018/09/16 18:05:22 UTC, 0 replies.
- Best practices on how to multiple spark sessions - posted by unk1102 <um...@gmail.com> on 2018/09/16 18:15:06 UTC, 1 replies.
- Run spark tests on Windows/docker - posted by Shmuel Blitz <sh...@similarweb.com> on 2018/09/16 19:09:51 UTC, 1 replies.
- Subscribe Multiple Topics Structured Streaming - posted by sivaprakash <si...@gmail.com> on 2018/09/17 08:28:24 UTC, 2 replies.
- why display this error - posted by hager <lo...@yahoo.com> on 2018/09/17 19:58:12 UTC, 0 replies.
- Re: Metastore problem on Spark2.3 with Hive3.0 - posted by Dongjoon Hyun <do...@gmail.com> on 2018/09/18 03:13:56 UTC, 0 replies.
- Spark FlatMapGroupsWithStateFunction throws cannot resolve 'named_struct()' due to data type mismatch 'SerializeFromObject" - posted by Kuttaiah Robin <ku...@gmail.com> on 2018/09/18 04:12:34 UTC, 0 replies.
- Required fields in Parquet - posted by Dan Osipov <da...@applicative.io> on 2018/09/18 21:27:01 UTC, 0 replies.
- Encoder for JValue - posted by Arko Provo Mukherjee <ar...@gmail.com> on 2018/09/19 01:05:05 UTC, 2 replies.
- Time-Series Forecasting - posted by Mina Aslani <as...@gmail.com> on 2018/09/19 16:01:25 UTC, 9 replies.
- DirectFileOutputCommitter in Spark 2.3.1 - posted by Priya Ch <le...@gmail.com> on 2018/09/19 17:45:36 UTC, 1 replies.
- How to read multiple libsvm files in Spark? - posted by "Md. Rezaul Karim" <re...@insight-centre.org> on 2018/09/20 10:47:45 UTC, 1 replies.
- Question about Spark cluster memory usage monitoring - posted by "Liu, Jialin" <ji...@illinois.edu> on 2018/09/20 20:46:42 UTC, 2 replies.
- Custom SparkListener - posted by Priya Ch <le...@gmail.com> on 2018/09/21 04:40:44 UTC, 1 replies.
- Re: Live Streamed Code Review today at 11am Pacific - posted by Holden Karau <ho...@pigscanfly.ca> on 2018/09/21 06:40:14 UTC, 1 replies.
- Spark Use Case Analysis - posted by "Ambi, Aniket" <An...@harman.com> on 2018/09/21 09:00:51 UTC, 0 replies.
- Lightweight pipeline execution for single eow - posted by Jatin Puri <pu...@gmail.com> on 2018/09/21 16:58:04 UTC, 2 replies.
- How to do efficient self join with Spark-SQL and Scala - posted by Chetan Khatri <ch...@gmail.com> on 2018/09/21 19:37:41 UTC, 1 replies.
- Re: Kafka Connector version support - posted by "Shixiong(Ryan) Zhu" <sh...@databricks.com> on 2018/09/21 22:02:54 UTC, 0 replies.
- Structured Streaming together with Cassandra Queries - posted by Martin Engen <Ma...@outlook.com> on 2018/09/22 09:21:24 UTC, 0 replies.
- Watermarking without aggregation with Structured Streaming - posted by peay <pe...@protonmail.com.INVALID> on 2018/09/22 14:13:19 UTC, 3 replies.
- Use Shared Variable in PySpark Executors - posted by Soheil Pourbafrani <so...@gmail.com> on 2018/09/22 15:33:48 UTC, 2 replies.
- Failed to shuffle write - posted by yguang11 <gu...@gmail.com> on 2018/09/23 06:27:45 UTC, 0 replies.
- Is it possible to implement Vector Space Model using PySpark - posted by Soheil Pourbafrani <so...@gmail.com> on 2018/09/24 04:38:57 UTC, 0 replies.
- [Spark UI] find driver for an application - posted by bsikander <be...@gmail.com> on 2018/09/24 08:48:05 UTC, 0 replies.
- Απάντηση: Re: Question about Spark cluster memory usage monitoring - posted by kolokasis <ko...@ics.forth.gr> on 2018/09/24 10:58:36 UTC, 0 replies.
- Yarn log aggregation of spark streaming job - posted by ayushChauhan <ay...@oyorooms.com> on 2018/09/24 12:09:15 UTC, 0 replies.
- Apache Spark and Airflow connection - posted by Uğur Sopaoğlu <us...@gmail.com> on 2018/09/24 12:26:14 UTC, 0 replies.
- How to access line fileName in loading file using the textFile method - posted by Soheil Pourbafrani <so...@gmail.com> on 2018/09/24 12:53:49 UTC, 3 replies.
- can I model any arbitrary data structure as an RDD? - posted by kant kodali <ka...@gmail.com> on 2018/09/25 10:51:28 UTC, 0 replies.
- Python kubernetes spark 2.4 branch - posted by "Garlapati, Suryanarayana (Nokia - IN/Bangalore)" <su...@nokia.com> on 2018/09/25 17:23:24 UTC, 3 replies.
- [Spark SQL]: Java Spark Classes With Attributes of Type Set In Datasets - posted by ddukek <di...@placed.com> on 2018/09/25 21:27:07 UTC, 1 replies.
- can Spark 2.4 work on JDK 11? - posted by kant kodali <ka...@gmail.com> on 2018/09/25 21:31:08 UTC, 1 replies.
- [Spark SQL] why spark sql hash() are returns the same hash value though the keys/expr are not same - posted by Gokula Krishnan D <em...@gmail.com> on 2018/09/26 01:57:27 UTC, 3 replies.
- How to recursively aggregate Treelike(hierarchical) data using Spark? - posted by newroyker <ne...@gmail.com> on 2018/09/26 04:02:20 UTC, 0 replies.
- Pivot Column ordering in spark - posted by Manohar Rao <ma...@gmail.com> on 2018/09/26 08:36:48 UTC, 0 replies.
- spark.lapply - posted by Junior Alvarez <ju...@ericsson.com> on 2018/09/26 11:31:44 UTC, 2 replies.
- Creating spark Row from database values - posted by Kuttaiah Robin <ku...@gmail.com> on 2018/09/26 12:01:46 UTC, 2 replies.
- spark and STS tokens (Federation Tokens) - posted by Ashic Mahtab <as...@live.com> on 2018/09/26 12:48:02 UTC, 0 replies.
- Given events with start and end times, how to count the number of simultaneous events using Spark? - posted by Debajyoti Roy <ne...@gmail.com> on 2018/09/26 16:32:20 UTC, 1 replies.
- Fwd: Spark 2.3.1: k8s driver pods stuck in Initializing state - posted by Christopher Carney <ch...@capitalone.com> on 2018/09/26 18:11:29 UTC, 4 replies.
- [ANNOUNCE] Announcing Apache Spark 2.3.2 - posted by Saisai Shao <sa...@gmail.com> on 2018/09/27 01:57:14 UTC, 0 replies.
- Data source V2 in spark 2.4.0 - posted by AssafMendelson <as...@rsa.com> on 2018/09/27 13:10:42 UTC, 0 replies.
- PySpark: batch_df in ForeachBatch - aggregation - posted by mmuru <mm...@gmail.com> on 2018/09/27 23:22:18 UTC, 0 replies.
- Looking for some feedbacks on proposal - native support of session window - posted by Jungtaek Lim <ka...@gmail.com> on 2018/09/27 23:45:01 UTC, 0 replies.
- Need to convert Dataset to HashMap - posted by rishmanisation <ri...@gmail.com> on 2018/09/27 23:48:17 UTC, 4 replies.
- How to repartition Spark DStream Kafka ConsumerRecord RDD. - posted by Alchemist <al...@gmail.com> on 2018/09/28 09:13:28 UTC, 0 replies.
- Text from pdf spark - posted by Joel D <ga...@gmail.com> on 2018/09/28 17:10:52 UTC, 2 replies.
- Run Spark on Java 10 - posted by Ben_W <be...@gmail.com> on 2018/09/29 02:57:10 UTC, 0 replies.
- error while submitting job - posted by yuvraj singh <19...@gmail.com> on 2018/09/30 05:42:32 UTC, 1 replies.
- error in job - posted by yuvraj singh <19...@gmail.com> on 2018/09/30 06:55:32 UTC, 0 replies.
- Re: [Structured Streaming SPARK-23966] Why non-atomic rename is problem in State Store ? - posted by chandan prakash <ch...@gmail.com> on 2018/09/30 07:50:47 UTC, 0 replies.
- Pyspark Partitioning - posted by dimitris plakas <di...@gmail.com> on 2018/09/30 18:30:53 UTC, 2 replies.