- Re: Code fails when AQE enabled in Spark 3.1 - posted by Gaspar Muñoz <gm...@datiobd.com> on 2022/02/01 07:49:40 UTC, 1 replies.
- Re: A Persisted Spark DataFrame is computed twice - posted by Gourav Sengupta <go...@gmail.com> on 2022/02/01 08:24:24 UTC, 0 replies.
- Structured Streaming - not showing records on console - posted by karan alang <ka...@gmail.com> on 2022/02/01 23:44:52 UTC, 2 replies.
- Structured Streaming on GCP Dataproc - java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/ByteArraySerializer - posted by karan alang <ka...@gmail.com> on 2022/02/02 06:50:34 UTC, 3 replies.
- [Spark K8s] Seeking Advice on Scaling Spark Cluster in Kubernetes - posted by Han Lin <hl...@twistbioscience.com> on 2022/02/02 08:04:26 UTC, 3 replies.
- [ANNOUNCE] .NET for Apache Spark™ 2.1 released - posted by Terry Kim <yu...@gmail.com> on 2022/02/02 20:24:59 UTC, 0 replies.
- GCP Dataproc - Failed to construct kafka consumer, Failed to load SSL keystore dataproc-versa-sase-p12-1.jks of type JKS - posted by karan alang <ka...@gmail.com> on 2022/02/02 23:36:46 UTC, 1 replies.
- Spark 3.1 Json4s-native jar compatibility - posted by Amit Sharma <re...@gmail.com> on 2022/02/03 19:56:37 UTC, 5 replies.
- Spark 3.1.2 full thread dumps - posted by Maksim Grinman <ma...@resolute.ai> on 2022/02/03 21:20:42 UTC, 12 replies.
- Re: DataStreamReader cleanSource option - posted by Jungtaek Lim <ka...@gmail.com> on 2022/02/04 05:50:50 UTC, 0 replies.
- Python performance - posted by Hinko Kocevar <Hi...@ess.eu.INVALID> on 2022/02/04 09:01:48 UTC, 3 replies.
- Spark on K8s : property simillar to yarn.max.application.attempt - posted by Pralabh Kumar <pr...@gmail.com> on 2022/02/04 10:20:58 UTC, 3 replies.
- Re: how can I remove the warning message - posted by Martin Grigorov <mg...@apache.org> on 2022/02/04 14:09:13 UTC, 0 replies.
- spark, autoscaling and handling node loss with autoscaling - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/05 09:03:59 UTC, 1 replies.
- Unsubscribe - posted by "chentao@birdiexx.com" <ch...@birdiexx.com> on 2022/02/06 02:59:17 UTC, 8 replies.
- help check my simple job - posted by ca...@free.fr on 2022/02/06 09:00:50 UTC, 2 replies.
- dataframe doesn't support higher order func, right? - posted by ca...@free.fr on 2022/02/06 11:50:59 UTC, 6 replies.
- add an auto_increment column - posted by ca...@free.fr on 2022/02/07 01:27:02 UTC, 13 replies.
- Fwd: (send this email to subscribe) - posted by Madhuchaitanya Joshi <ma...@gmail.com> on 2022/02/07 02:15:42 UTC, 0 replies.
- TypeError: Can not infer schema for type: - posted by ca...@free.fr on 2022/02/07 04:09:48 UTC, 3 replies.
- foreachRDD question - posted by Bitfox <bi...@bitfox.top> on 2022/02/07 08:34:17 UTC, 0 replies.
- StructuredStreaming - foreach/foreachBatch - posted by karan alang <ka...@gmail.com> on 2022/02/07 21:05:07 UTC, 7 replies.
- question on the different way of RDD to dataframe - posted by ca...@free.fr on 2022/02/08 10:16:22 UTC, 4 replies.
- Does spark support something like the bind function in R? - posted by Andrew Davidson <ae...@ucsc.edu.INVALID> on 2022/02/08 15:55:15 UTC, 1 replies.
- Does spark have something like rowsum() in R? - posted by Andrew Davidson <ae...@ucsc.edu.INVALID> on 2022/02/08 16:01:13 UTC, 5 replies.
- Help With unstructured text file with spark scala - posted by Danilo Sousa <da...@gmail.com> on 2022/02/08 16:49:56 UTC, 8 replies.
- flatMap for dataframe - posted by frakass <ca...@free.fr> on 2022/02/09 01:55:10 UTC, 3 replies.
- Using Avro file format with SparkSQL - posted by "Karanika, Anna" <an...@illinois.edu> on 2022/02/10 03:25:28 UTC, 6 replies.
- Execution efficiency slows down as the number of CPU cores increases - posted by "15927907987@163.com" <15...@163.com> on 2022/02/10 09:30:09 UTC, 4 replies.
- data size exceeds the total ram - posted by frakass <ca...@free.fr> on 2022/02/11 09:22:29 UTC, 6 replies.
- Unable to force small partitions in streaming job without repartitioning - posted by Chris Coutinho <ch...@gmail.com> on 2022/02/11 12:22:54 UTC, 9 replies.
- how to classify column - posted by frakass <ca...@free.fr> on 2022/02/11 12:29:02 UTC, 2 replies.
- Repartitioning dataframe by file wite size and preserving order - posted by Danil Suetin <su...@protonmail.com.INVALID> on 2022/02/11 14:15:24 UTC, 0 replies.
- determine week of month from date in spark3 - posted by "Appel, Kevin" <ke...@bofa.com.INVALID> on 2022/02/11 18:41:41 UTC, 2 replies.
- Deploying Spark on Google Kubernetes (GKE) autopilot, preliminary findings - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/11 20:34:03 UTC, 5 replies.
- Apache spark 3.0.3 [Spark lower version enhancements] - posted by Rajesh Krishnamurthy <rk...@perforce.com> on 2022/02/11 22:16:55 UTC, 5 replies.
- Unable to access Google buckets using spark-submit - posted by karan alang <ka...@gmail.com> on 2022/02/12 04:30:10 UTC, 8 replies.
- unsubscribe - posted by Basavaraj <ra...@gmail.com> on 2022/02/12 04:35:07 UTC, 0 replies.
- Failed to construct kafka consumer, Failed to load SSL keystore + Spark Streaming - posted by joyan sil <jo...@gmail.com> on 2022/02/12 16:32:50 UTC, 0 replies.
- Re: [EXTERNAL] Re: Unable to access Google buckets using spark-submit - posted by Saurabh Gulati <sa...@fedex.com.INVALID> on 2022/02/14 11:14:09 UTC, 0 replies.
- Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - posted by Gnana Kumar <gn...@gmail.com> on 2022/02/14 16:21:12 UTC, 2 replies.
- [MLlib]: GLM with multinomial family - posted by Surya Rajaraman Iyer <si...@medallia.com> on 2022/02/14 17:36:31 UTC, 1 replies.
- Spark kubernetes s3 connectivity issue - posted by Raj ks <ra...@gmail.com> on 2022/02/14 19:09:11 UTC, 5 replies.
- Position for 'cf.content' not found in row - posted by 潘明文 <pa...@163.com> on 2022/02/15 03:30:31 UTC, 2 replies.
- SparkStructured Streaming using withWatermark - TypeError: 'module' object is not callable - posted by karan alang <ka...@gmail.com> on 2022/02/16 06:36:37 UTC, 3 replies.
- Which manufacturers' GPUs support Spark? - posted by "15927907987@163.com" <15...@163.com> on 2022/02/16 06:59:30 UTC, 2 replies.
- Implementing circuit breaker pattern in Spark - posted by S <sh...@gmail.com> on 2022/02/16 09:17:17 UTC, 8 replies.
- restoring SQL text from logical plan - posted by Wang Cheng <34...@qq.com.INVALID> on 2022/02/16 09:19:17 UTC, 0 replies.
- Cast int to string not possible? - posted by Rico Bergmann <in...@ricobergmann.de> on 2022/02/16 16:26:23 UTC, 8 replies.
- Deploying docker images in Google Kubernetes engines - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/16 17:45:52 UTC, 0 replies.
- [Spark SQL] Is there any free ODBC driver - posted by Rostyslav Myroshnychenko <ro...@instacart.com.INVALID> on 2022/02/16 23:17:43 UTC, 0 replies.
- Fwd: Spark 3.2.1 in Google Kubernetes Version 1.19 or 1.21 - SparkSubmit Error - posted by Gnana Kumar <gn...@gmail.com> on 2022/02/17 12:15:23 UTC, 6 replies.
- writing a Dataframe (with one of the columns as struct) into Kafka - posted by karan alang <ka...@gmail.com> on 2022/02/18 00:21:22 UTC, 0 replies.
- Encoders.STRING() causing performance problems in Java application - posted by ma...@wunderlich.com on 2022/02/18 06:42:10 UTC, 5 replies.
- GCP Dataproc - error in importing KafkaProducer - posted by karan alang <ka...@gmail.com> on 2022/02/18 07:45:57 UTC, 0 replies.
- Scala/Spark Kernel for Jupyter - posted by Artemis User <ar...@dtechspace.com> on 2022/02/18 14:34:10 UTC, 0 replies.
- Spark Explain Plan and Joins - posted by Sid Kal <fl...@gmail.com> on 2022/02/19 09:59:15 UTC, 13 replies.
- Docker images for Spark 3.1.1 and Spark 3.1.2 with Java 11 and Java 8 from docker hub - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/20 13:50:48 UTC, 1 replies.
- Question about spark.sql min_by - posted by David Diebold <da...@gmail.com> on 2022/02/21 10:00:58 UTC, 4 replies.
- Logging to determine why driver fails - posted by "Michael Williams (SSI)" <Mi...@ssigroup.com> on 2022/02/21 14:15:09 UTC, 3 replies.
- Spark-SQL : Getting current user name in UDF - posted by "Lavelle, Shawn" <Sh...@osii.com.INVALID> on 2022/02/21 23:38:51 UTC, 1 replies.
- Need to make WHERE clause compulsory in Spark SQL - posted by Saurabh Gulati <sa...@fedex.com.INVALID> on 2022/02/22 12:34:42 UTC, 2 replies.
- Re: [EXTERNAL] Re: Need to make WHERE clause compulsory in Spark SQL - posted by Saurabh Gulati <sa...@fedex.com.INVALID> on 2022/02/22 15:33:51 UTC, 4 replies.
- TensorFlow on Spark - posted by Vijayant Kumar <Vi...@mavenir.com.INVALID> on 2022/02/23 02:51:03 UTC, 1 replies.
- RE: [E] COMMERCIAL BULK: Re: TensorFlow on Spark - posted by Vijayant Kumar <Vi...@mavenir.com.INVALID> on 2022/02/23 03:27:04 UTC, 12 replies.
- One click to run Spark on Kubernetes - posted by bo yang <bo...@gmail.com> on 2022/02/23 04:05:45 UTC, 16 replies.
- Spark 3.1.3 docker pre-built with Python Data science packages - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/23 11:16:21 UTC, 0 replies.
- Loading .xlsx and .xlx files using pyspark - posted by Sid <fl...@gmail.com> on 2022/02/23 13:30:39 UTC, 8 replies.
- Unable to display JSON records with null values - posted by Sid <fl...@gmail.com> on 2022/02/23 17:56:42 UTC, 2 replies.
- Structured Streaming + UDF - logic based on checking if a column is present in the Dataframe - posted by karan alang <ka...@gmail.com> on 2022/02/23 20:42:40 UTC, 1 replies.
- DataTables 1.10.20 reported vulnerable in spark-core_2.13:3.2.1 - posted by vinodh palanisamy <vi...@gmail.com> on 2022/02/24 10:15:55 UTC, 1 replies.
- Consuming from Kafka to delta table - stream or batch mode? - posted by "Michael Williams (SSI)" <Mi...@ssigroup.com> on 2022/02/24 14:05:50 UTC, 2 replies.
- - posted by Luca Borin <bo...@gmail.com> on 2022/02/24 19:30:15 UTC, 0 replies.
- Non-Partition based Workload Distribution - posted by Artemis User <ar...@dtechspace.com> on 2022/02/24 20:24:02 UTC, 1 replies.
- Spark Kafka Integration - posted by "Michael Williams (SSI)" <Mi...@ssigroup.com> on 2022/02/25 19:37:23 UTC, 11 replies.
- StructuredStreaming error - pyspark.sql.utils.StreamingQueryException: batch 44 doesn't exist - posted by karan alang <ka...@gmail.com> on 2022/02/25 22:30:25 UTC, 9 replies.
- can dataframe API deal with subquery - posted by ca...@free.fr on 2022/02/26 08:00:03 UTC, 0 replies.
- Re: How to gracefully shutdown Spark Structured Streaming - posted by Mich Talebzadeh <mi...@gmail.com> on 2022/02/26 10:42:13 UTC, 1 replies.
- Issue while creating spark app - posted by rajat kumar <ku...@gmail.com> on 2022/02/26 16:10:25 UTC, 13 replies.
- Difference between windowing functions and aggregation functions on big data - posted by Sid <fl...@gmail.com> on 2022/02/27 17:29:51 UTC, 10 replies.
- Spark and Hive Metastore Authorzation - posted by "Hartwig, Jonas" <jo...@teliacompany.com> on 2022/02/28 06:57:34 UTC, 0 replies.
- Accumulator null pointer exception - posted by Abhimanyu Kumar Singh <ab...@gmail.com> on 2022/02/28 09:37:27 UTC, 0 replies.
- [Spark SQL] Null when trying to use corr() with a Window - posted by Edgar H <ka...@gmail.com> on 2022/02/28 12:50:35 UTC, 7 replies.