user@spark.apache.org, 2020-11

- Re: [Spark Core] Vectorizing very high-dimensional data sourced in long format - posted by kevin chen <kc...@gmail.com> on 2020/11/01 09:50:17 UTC, 0 replies.
- Re: spark-submit parameters about two keytab files to yarn and kafka - posted by kevin chen <kc...@gmail.com> on 2020/11/01 14:29:52 UTC, 0 replies.
- Executors across Stages - posted by AVS Bharadwaj <19...@iitb.ac.in> on 2020/11/02 06:06:04 UTC, 0 replies.
- [UNSUB me please] - posted by northbright <no...@gmail.com> on 2020/11/02 08:53:32 UTC, 0 replies.
- Re: Integration testing Framework Spark SQL Scala - posted by Lars Albertsson <la...@mapflat.com> on 2020/11/02 13:09:34 UTC, 0 replies.
- Cannot perform operation after producer has been closed - posted by Eric Beabes <ma...@gmail.com> on 2020/11/02 16:52:57 UTC, 7 replies.
- Best way to emit custom metrics to Prometheus in spark structured streaming - posted by meetwes <me...@gmail.com> on 2020/11/02 18:12:44 UTC, 3 replies.
- Passing authentication token to the user session in Spark Thrift Server - posted by mhd wrk <mh...@gmail.com> on 2020/11/02 23:36:26 UTC, 0 replies.
- Re: Spark streaming with Kafka - posted by MohitAbbi <mo...@gmail.com> on 2020/11/03 19:25:45 UTC, 1 replies.
- Using two WriteStreams in same spark structured streaming job - posted by act_coder <ac...@gmail.com> on 2020/11/04 13:55:49 UTC, 3 replies.
- Spark reading from cassandra - posted by Amit Sharma <re...@gmail.com> on 2020/11/04 17:05:31 UTC, 2 replies.
- Confuse on Spark to_date function - posted by 杨仲鲍 <ge...@icloud.com.INVALID> on 2020/11/05 03:48:45 UTC, 1 replies.
- How does order work in Row objects when .toDF() is called? - posted by Daniel Stojanov <ma...@danielstojanov.com> on 2020/11/05 10:45:39 UTC, 0 replies.
- Need suggestions for Spark on K8S: RPC Encryption - posted by Xuan Gong <je...@gmail.com> on 2020/11/05 22:59:35 UTC, 0 replies.
- Re: Excessive disk IO with Spark structured streaming - posted by Jungtaek Lim <ka...@gmail.com> on 2020/11/05 23:46:08 UTC, 0 replies.
- Announcing .NET for Apache Spark™ 1.0 - posted by Terry Kim <yu...@gmail.com> on 2020/11/06 22:29:51 UTC, 0 replies.
- Out of memory issue - posted by Amit Sharma <re...@gmail.com> on 2020/11/08 18:35:43 UTC, 4 replies.
- Reading data slows down when Spark3.0 uses multiple cpu cores - posted by 叶新 <15...@163.com> on 2020/11/09 02:03:10 UTC, 1 replies.
- Ask about Pyspark ML interaction - posted by "Du, Yi" <YD...@archcapservices.com> on 2020/11/09 13:53:13 UTC, 2 replies.
- [Structured Streaming] Join stream of readings with collection of previous related readings - posted by "nathan.brinks" <na...@spindance.com> on 2020/11/09 16:29:13 UTC, 1 replies.
- repartition in Spark - posted by "ashok34668@yahoo.com.INVALID" <as...@yahoo.com.INVALID> on 2020/11/09 16:56:32 UTC, 1 replies.
- spark cassandra questiom - posted by adfel70 <ad...@gmail.com> on 2020/11/10 12:42:02 UTC, 1 replies.
- Distribution of spark 3.0.1 with Hive1.2 - posted by Dmitry <fr...@gmail.com> on 2020/11/10 13:41:22 UTC, 1 replies.
- Creating hive table through df.write.mode("overwrite").saveAsTable("DB.TABLE") - posted by Mich Talebzadeh <mi...@gmail.com> on 2020/11/10 15:25:05 UTC, 0 replies.
- Spark Parquet file size - posted by Tzahi File <tz...@ironsrc.com> on 2020/11/10 15:55:45 UTC, 0 replies.
- DStreams stop consuming from Kafka - posted by Razvan-Daniel Mihai <ra...@gmail.com> on 2020/11/10 17:02:07 UTC, 1 replies.
- Blacklisting in Spark Stateful Structured Streaming - posted by Eric Beabes <ma...@gmail.com> on 2020/11/10 22:36:37 UTC, 2 replies.
- Pyspark application hangs (no error messages) on Python RDD .map - posted by Daniel Stojanov <ma...@danielstojanov.com> on 2020/11/11 05:30:36 UTC, 0 replies.
- Spark 2.4 lifetime - posted by Netanel Malka <ne...@gmail.com> on 2020/11/11 06:23:27 UTC, 1 replies.
- Slow insert into overwrite in spark in object store backed hive tables - posted by joyan sil <jo...@gmail.com> on 2020/11/11 17:23:33 UTC, 0 replies.
- spark UI storage tab - posted by Amit Sharma <re...@gmail.com> on 2020/11/11 21:40:25 UTC, 0 replies.
- Spark[SqL] performance tuning - posted by Lakshmi Nivedita <kl...@gmail.com> on 2020/11/12 09:48:19 UTC, 0 replies.
- Spark Dataset withColumn issue - posted by Vikas Garg <sp...@gmail.com> on 2020/11/12 14:26:08 UTC, 6 replies.
- Path of jars added to a Spark Job - spark-submit // // Override jars in spark submit - posted by Dominique De Vito <dd...@gmail.com> on 2020/11/12 16:01:45 UTC, 4 replies.
- Purpose of type in pandas_udf - posted by Daniel Stojanov <ma...@danielstojanov.com> on 2020/11/12 23:19:36 UTC, 1 replies.
- Refreshing Data in Spark Memory (DataFrames) - posted by Arti Pande <pa...@gmail.com> on 2020/11/13 17:41:54 UTC, 3 replies.
- Spark on Kubernetes - posted by Arti Pande <pa...@gmail.com> on 2020/11/13 17:49:27 UTC, 0 replies.
- PyCharm IDE throws spark error - posted by Mich Talebzadeh <mi...@gmail.com> on 2020/11/13 22:23:46 UTC, 2 replies.
- Submitting extra jars on spark applications on yarn with cluster mode - posted by Pedro Cardoso <pe...@feedzai.com> on 2020/11/14 12:25:08 UTC, 2 replies.
- Re:Re: Flink 1.11 not showing logs - posted by 马阳阳 <ma...@163.com> on 2020/11/16 06:36:59 UTC, 0 replies.
- spark-sql on windows throws Exception in thread "main" java.lang.UnsatisfiedLinkError: - posted by Mich Talebzadeh <mi...@gmail.com> on 2020/11/16 21:21:23 UTC, 0 replies.
- Single spark streaming job to read incoming events with dynamic schema - posted by act_coder <ac...@gmail.com> on 2020/11/17 02:47:00 UTC, 0 replies.
- Announcing Hyperspace v0.3.0 - an indexing subsystem for Apache Spark™ - posted by Terry Kim <yu...@gmail.com> on 2020/11/18 00:46:10 UTC, 0 replies.
- RE: [EXTERNAL] Announcing Hyperspace v0.3.0 - an indexing subsystem for Apache Spark™ - posted by Rahul Potharaju <ra...@microsoft.com.INVALID> on 2020/11/18 04:54:47 UTC, 0 replies.
- Can all the parameters of hive be used on spark sql? - posted by Gang Li <lg...@gmail.com> on 2020/11/18 06:42:37 UTC, 0 replies.
- Disable parquet metadata for count - posted by Gary Li <ga...@outlook.com> on 2020/11/18 06:52:44 UTC, 0 replies.
- Spark Exception - posted by Amit Sharma <re...@gmail.com> on 2020/11/18 17:05:05 UTC, 3 replies.
- Spark SQL check timestamp with other table and update a column. - posted by anbutech <an...@outlook.com> on 2020/11/19 05:13:16 UTC, 1 replies.
- Need Unit test complete reference for Pyspark - posted by Sachit Murarka <co...@gmail.com> on 2020/11/19 06:37:47 UTC, 1 replies.
- Spark 3.0.1 new Proleptic Gregorian calendar - posted by Saurabh Gulati <sa...@fedex.com.INVALID> on 2020/11/19 16:40:18 UTC, 1 replies.
- Re: Kafka Topic to Parquet HDFS with Structured Streaming - posted by AlbertoMarq <al...@gmail.com> on 2020/11/19 22:17:05 UTC, 0 replies.
- unsubscribe - posted by youso b <bo...@gmail.com> on 2020/11/20 14:36:29 UTC, 0 replies.
- Cache not getting cleaned. - posted by Amit Sharma <re...@gmail.com> on 2020/11/21 21:25:20 UTC, 4 replies.
- How to apply ranger policies on Spark - posted by joyan sil <jo...@gmail.com> on 2020/11/23 18:03:40 UTC, 3 replies.
- How to submit a job via REST API? - posted by Zhou Yang <zh...@outlook.com> on 2020/11/24 02:33:55 UTC, 4 replies.
- how to manage HBase connections in Executors of Spark Streaming ? - posted by big data <bi...@outlook.com> on 2020/11/24 05:58:03 UTC, 1 replies.
- Building High-performance Lake for Spark using OSS, Hudi, Alluxio - posted by Bin Fan <fa...@gmail.com> on 2020/11/24 06:10:02 UTC, 0 replies.
- Running the driver on a laptop but data is on the Spark server - posted by Ryan Victory <rv...@gmail.com> on 2020/11/25 14:51:27 UTC, 8 replies.
- Re: Spark Structured Streaming XML content - posted by akstremepro <ak...@gmail.com> on 2020/11/26 08:36:50 UTC, 0 replies.
- Error in PyCharm with PySpark - posted by Mich Talebzadeh <mi...@gmail.com> on 2020/11/26 09:34:18 UTC, 0 replies.
- Stream-static join : Refreshing subset of static data / Connection pooling - posted by Geervan Hayatnagarkar <pa...@gmail.com> on 2020/11/26 13:36:10 UTC, 7 replies.
- converting dataframe from one format to another in spark structured streaming - posted by act_coder <ac...@gmail.com> on 2020/11/27 05:39:48 UTC, 1 replies.
- Separating storage from compute layer with Spark and data warehouses offering ML capabilities - posted by Mich Talebzadeh <mi...@gmail.com> on 2020/11/29 09:08:13 UTC, 0 replies.